JPPF - The open source grid computing solution

Topic: the importance of the heap size of DriverLauncher

broiyan

the importance of the heap size of DriverLauncher
« on: March 20, 2017, 02:17:51 AM »

By default, the -Xmx value in startDriver.sh for DriverLauncher is 32M. I am using JPPF 5.1.2. I have the following failure in the Linux console of the driver.

Quote
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000004c7f80000, 226492416, 0) failed; error='Cannot allocate memory' (errno=12)
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 226492416 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/brian/driverDeployment/driver/hs_err_pid2287.log
process exited with code 1

The HotSpot error file contains the following, and more.

Quote
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 226492416 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2627), pid=2287, tid=0x00007fcbfcfb4700

My inclination at this point is to add physical memory and also to increase the swap space. I am posting on this forum to see if there is any reason to expect that -Xmx32M might also need to be revised. My initial belief is that the 32M value is not relevant, because it applies to the DriverLauncher, which, judging by its name, should not be affected by the size of the objects being processed: it merely launches. Since the application takes about 8 days to run, I would like to minimize trial and error, so I am posting here for a recommendation that may help me reduce trials.

lolo

Re: the importance of the heap size of DriverLauncher
« Reply #1 on: March 20, 2017, 05:50:48 AM »

Hello,

Yes, as you guessed, DriverLauncher merely launches and monitors the actual JPPFDriver Java process. As such, 32MB is probably slightly overkill for it, but better safe than sorry.
Furthermore, the Java dump file indicates that an allocation of 226,492,416 bytes (216 MB) failed, so I don't think it is the DriverLauncher process that crashes, but rather the driver JVM that it tries to launch.
It looks like you don't have enough RAM or swap space left, either because there isn't much on this machine, or because applications already running are using it.
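To clarify where each heap is configured: the -Xmx in startDriver.sh only sizes the launcher JVM, while the heap of the actual driver JVM is set through the jppf.jvm.options property in jppf-driver.properties. For example (the 1g value here is purely illustrative, not a recommendation):

Code:
# jppf-driver.properties
# heap of the real driver process forked by DriverLauncher
jppf.jvm.options = -server -Xmx1g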

I hope this helps,
-Laurent

broiyan

Re: the importance of the heap size of DriverLauncher
« Reply #2 on: March 20, 2017, 07:15:21 AM »

Thank you for the reply.

The crash occurs towards the end of the 8-day execution, probably when results are being sent back from the nodes to the driver/application. This code is known to work with an object graph that I consider to be 4 units of data (a run takes 10,400 seconds). With 200 units of data the expectation, linearly speaking, was 144 hours (6 days), but of course not everything scales linearly, so I was not surprised when it exceeded 6 days.

I log, with a 1 second period, the high water mark of the heap used at each node and also at the application. As far as I know there is no need to take a heap high water mark at the driver, and your reply confirms that. The Xmx value of the application is about 15.5GB (physical is 16GB), and after the crash the high water mark is only about 8.2GB (much less than Xmx and much less than physical). This could be wrong, since the logging happens only once per second. Aside: each node's heap high water mark is about 6.5GB (Xmx is 8GB and physical is 8GB).
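A minimal sketch of the kind of once-per-second sampler I mean (the class and names are mine, using the standard MemoryMXBean; as noted, a 1 second period can miss short-lived peaks):

Code:
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Samples the used heap once per second and remembers the maximum seen.
public class HeapWatermark {
  private static final MemoryMXBean MEM = ManagementFactory.getMemoryMXBean();
  private static final AtomicLong MAX = new AtomicLong();

  public static void start() {
    ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
    ses.scheduleAtFixedRate(() -> {
      long used = MEM.getHeapMemoryUsage().getUsed();
      MAX.accumulateAndGet(used, Math::max); // keep the high water mark
    }, 0L, 1L, TimeUnit.SECONDS);
  }

  public static long highWaterMark() { return MAX.get(); }
}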

As it is, my heap high water mark at the application does not provide a clue as to how much heap space or physical memory I need to add. In fact, looking at the high water mark, it doesn't seem that I need more heap. So my attention turns to whether the stack of the application is adequate, whether the stack of the driver is adequate, or whether JPPF is memory-hungry (on a stack? on a heap?) when receiving results. I don't know if JPPF's heap use is reflected in my application's heap high water mark, or if there is another heap behind the scenes belonging to JPPF.

Besides the application heap size, is there any constraint in JPPF that would have a problem with a "result" that is a large graph of objects ("inbound", in other words from the nodes to the driver/application)?

Lastly: since I have to wait 8 days, communications bandwidth and timeliness are obviously not a concern, so my problem has me considering a different result-passing method. I could have each node write each result as a file to the hard disk at the node; the JPPF "result" would then be just a boolean that says it is finished, and I would implement some way for the application to fetch the file from each node. I would consider this alternative if JPPF is not well suited to passing back results. I'm not familiar with the internal workings of JPPF, so I would appreciate any comment on the suitability of what I am doing, or of the alternative.
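For concreteness, here is a rough sketch of the alternative I have in mind (assuming the JPPF 5.x AbstractTask API; the output path and the MyResult type are placeholders):

Code:
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.jppf.node.protocol.AbstractTask;

// Compute on the node, persist the large result graph locally,
// and send back only a Boolean flag through JPPF.
public class FileResultTask extends AbstractTask<Boolean> {
  // placeholder for the real, large result graph
  static class MyResult implements Serializable { }

  @Override
  public void run() {
    try {
      MyResult result = compute();
      Path out = Paths.get("/tmp/results", getId() + ".bin"); // illustrative location
      Files.createDirectories(out.getParent());
      try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(out))) {
        oos.writeObject(result);
      }
      setResult(Boolean.TRUE); // only this flag travels back to the driver/client
    } catch (Exception e) {
      setThrowable(e);
    }
  }

  private MyResult compute() { return new MyResult(); } // application-specific work
}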

lolo

Re: the importance of the heap size of DriverLauncher
« Reply #3 on: March 20, 2017, 09:34:41 AM »

Hello,

Maybe a little background could help here. If you take a look at the communication protocol between drivers, nodes and clients, you can see that any network message is made of one or more serialized object graphs, each preceded by its size. The size is an integer, so a hard constraint here is that a serialized object graph cannot exceed Integer.MAX_VALUE bytes. However, we have specific code to detect overflows and handle them by throwing a specific exception. In the case of a node sending a task back to the server, this results in an instance of JPPFExceptionResult being sent in place of the original task, so even though the task cannot be sent back, you have a chance to understand why.
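For example, on the client side you could detect this case along these lines (a sketch assuming the 5.x client API, reusing the FileResultTask sketch from the previous reply):

Code:
import java.util.List;
import org.jppf.client.JPPFClient;
import org.jppf.client.JPPFJob;
import org.jppf.node.protocol.JPPFExceptionResult;
import org.jppf.node.protocol.Task;

public class CheckResults {
  public static void main(String[] args) throws Exception {
    try (JPPFClient client = new JPPFClient()) {
      JPPFJob job = new JPPFJob();
      job.add(new FileResultTask()); // task class from the sketch above
      List<Task<?>> results = client.submitJob(job);
      for (Task<?> task : results) {
        if (task instanceof JPPFExceptionResult) {
          // the original task could not be sent back; the cause explains why
          task.getThrowable().printStackTrace();
        }
      }
    }
  }
}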

Some memory usage considerations:

- while objects are being deserialized, their memory usage may be considered doubled, even if only temporarily. This applies mostly to the nodes and clients, since the driver does not serialize/deserialize the tasks (nor the data provider, if any). It is especially important in the nodes, since nodes serialize and deserialize multiple tasks in parallel, using the configured processing thread pool. Serialization/deserialization can instead be made sequential by setting the configuration property "jppf.sequential.deserialization = true" (see the configuration sketch after this list).

- the driver has memory-sensitive processing which decides whether to keep serialized tasks in memory or offload them to disk, based on their size and the available heap. There are 3 configuration properties that can be used to tune this behavior:
  • jppf.disk.overflow.threshold: the ratio of available heap over the size of an object to deserialize, below which disk overflow is triggered. Default: 2.0
  • jppf.gc.on.disk.overflow: whether to call System.gc() and recompute the available heap size before triggering disk overflow. Default: true
  • jppf.low.memory.threshold: the minimum heap size in MB below which disk overflow is systematically triggered, to avoid heap fragmentation and ensure there's enough memory to deserialize job headers. Default: 32 MB
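For illustration, this is how these would look in the configuration files (the values shown for the driver properties are the defaults listed above; the sequential deserialization switch from the first point applies to the nodes, so it would presumably go in jppf-node.properties):

Code:
# jppf-driver.properties: disk overflow tuning (defaults shown)
jppf.disk.overflow.threshold = 2.0
jppf.gc.on.disk.overflow = true
jppf.low.memory.threshold = 32

# jppf-node.properties: make task (de)serialization sequential
jppf.sequential.deserialization = true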

This being said, the symptoms you posted are indicative of a native memory allocation problem, not a Java heap exhaustion issue. I would therefore look for issues in the configuration of your environment/OS. In particular, if you are running in a virtual machine, it is possible that no swap file is defined, or that it is insufficiently sized. A possible way to test that at driver startup would be to set -Xms to the same value as -Xmx (in the jppf.jvm.options configuration property).
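For example (the sizes here are illustrative only):

Code:
# jppf-driver.properties: commit the full heap up front, so a RAM/swap
# shortfall fails fast at startup instead of 8 days into the run
jppf.jvm.options = -Xms1g -Xmx1g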

Sincerely,
-Laurent