JPPF: the open source grid computing solution


Author Topic: elapsed time for triple duty physical servers  (Read 379 times)

broiyan (JPPF Grand Master, 54 posts)
elapsed time for triple duty physical servers
« on: April 12, 2017, 03:22:18 AM »

My application has long-running tasks. Each task takes the same amount of time, which might be 200 hours in practice or 1 minute in a test program. I reasoned that the physical machine running the driver and application has nothing to do while waiting for nodes; if it has nothing to do, it would be economical to run a node on that same physical machine and connect it to the localhost driver. I am using JPPF 5.1.2.

The configuration differences compared to "stock" are listed here. Note that there will be minor modifications to this configuration, because I have scripts that replace "localhost" with an IP address where necessary, and scripts that change -XmxVALUE into a concrete value such as -Xmx999. The differences output is from the Linux diff utility.

APPLICATION DIFFERENCES...
87c87
< #jppf.ssl.enabled = true
---
> jppf.ssl.enabled = false
139c139
< #jppf.discovery.enabled = true
---
> jppf.discovery.enabled = false
--------------------------
 
DRIVER DIFFERENCES...
33,34c33,34
< jppf.ssl.server.port = 11443
< #jppf.ssl.server.port = -1
---
> #jppf.ssl.server.port = 11443
> jppf.ssl.server.port = -1
127c127
< jppf.load.balancing.algorithm = proportional
---
> jppf.load.balancing.algorithm = nodethreads
130c130
< jppf.load.balancing.profile = proportional_profile
---
> jppf.load.balancing.profile = nodethreads_profile
163c163
< jppf.jvm.options = -Xmx256m -Djava.util.logging.config.file=config/logging-driver.properties
---
> jppf.jvm.options = -XmxVALUE -Djava.util.logging.config.file=config/logging-driver.properties
--------------------------
 
NODE DIFFERENCES...
25c25
< #jppf.server.host = localhost
---
> jppf.server.host = localhost
28c28
< #jppf.server.port = 11111
---
> jppf.server.port = 11111
51c51
< #jppf.ssl.enabled = true
---
> jppf.ssl.enabled = false
162c162
< jppf.jvm.options = -Xmx128m -Djava.util.logging.config.file=config/logging-node.properties
---
> jppf.jvm.options = -enableassertions -XmxVALUE -Djava.util.logging.config.file=config/logging-node.properties
--------------------------

I find that if each task takes 4 hours (in a test application), the physical server doing triple duty (node, driver, application) has not completed after 6 hours. I do not know the ultimate elapsed time to completion because I terminated the tasks, but it appears this one physical server needs at least 150% as much time as the others. Using a test application with a short task duration, I see that most nodes complete in about 60 seconds while the physical server doing triple duty takes about 110 seconds, so that one physical server needs almost 200% as much time. Is this normal? There is no swap activity according to vmstat. There is ample RAM, and the heap high water mark (sampled at a 1 second interval) stays very much below it.
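As an aside, the kind of once-per-second heap high-water-mark sampling mentioned above can be done with the standard MemoryMXBean; a minimal sketch (class and method names are illustrative, not from the poster's code):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative heap high-water-mark sampler, polled once per second.
public class HeapHighWaterMark {
  private static final AtomicLong highWaterMark = new AtomicLong();
  private static final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

  // Takes one sample and returns the highest heap usage seen so far, in bytes.
  public static long sampleOnce() {
    long used = memory.getHeapMemoryUsage().getUsed();
    highWaterMark.accumulateAndGet(used, Math::max);
    return highWaterMark.get();
  }

  // Starts a daemon thread that samples every second.
  public static void startSampling() {
    ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "heap-hwm-sampler");
      t.setDaemon(true); // do not keep the JVM alive
      return t;
    });
    ses.scheduleAtFixedRate(HeapHighWaterMark::sampleOnce, 0, 1, TimeUnit.SECONDS);
  }

  public static void main(String[] args) {
    startSampling();
    System.out.println("heap high water mark so far: " + sampleOnce() + " bytes");
  }
}
```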
lolo (Administrator, JPPF Council Member, 2223 posts)
Re: elapsed time for triple duty physical servers
« Reply #1 on: April 12, 2017, 07:16:19 AM »

Hello,

One possible explanation that comes to mind is that the hardware differs between the regular nodes and the "triple-duty" machine. It is possible that this machine has fewer cores, slower cores, or both. Could you tell us if that's the case?

Another possibility is that the tasks need to access remote services, for instance a database (though it could really be anything else), and that the regular nodes and the driver machine are on separate networks, resulting in slower network I/O for that machine.

Also, the distribution of the tasks within jobs could make a difference. Could you tell us how many tasks each job holds, and how many cores each node machine has? I see in your driver configuration that you are using the "nodethreads" load-balancing algorithm, so I'm assuming each node will receive at most nbCores tasks at any given time.
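For reference, the nodethreads algorithm is configured in the driver along these lines (a sketch assuming the JPPF 5.x property naming; the multiplicator value of 1 is illustrative and caps each node at one task per processing thread):

```
# driver configuration: nodethreads load balancing
jppf.load.balancing.algorithm = nodethreads
jppf.load.balancing.profile = nodethreads_profile
# max tasks per node = multiplicator * number of node processing threads
jppf.load.balancing.profile.nodethreads_profile.multiplicator = 1
```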

Sincerely,
-Laurent
broiyan (JPPF Grand Master, 54 posts)
Re: elapsed time for triple duty physical servers
« Reply #2 on: April 12, 2017, 08:17:17 AM »

The machines are identical except the triple duty machine has more RAM.
The machines are all on the same local subnet and all plugged into the same ethernet switch.
The nodes operate only on data they received in the task so there is no data access to a database.
The nodes do no internet queries.
After a node receives the data (as part of the task) there is no need to communicate until the end of the task (other than any JPPF communications that I would not be aware of).
There is no need to write to the hard disk except for trivial things like logging the heap high water mark.

The job has 55 tasks. The 55 tasks are sent to 7 servers with 8 cores each. Most servers get 8 tasks and one server gets 7 tasks. A node with 7 tasks reports 87% CPU usage. This suggests there are 7 busy cores of the 8 total. A node with 8 tasks reports 99% CPU usage so all 8 cores are busy. This implies that a 7 task server should finish at the same time as an 8 task server. Which server gets 7 tasks seems random to me and is likely determined by JPPF.

This phenomenon where the triple duty server takes longer was NOT a problem when I ran a job with 16 tasks sent to 2 servers (one triple duty server with 8 tasks plus one node server with 8 tasks). In other words the triple duty server finished 8 tasks at approximately the same time the node server finished 8 tasks.

That this phenomenon appears when there are 7 servers (one triple duty server, six node servers, 55 tasks) was surprising.

EDIT: I have run a 4 hour test with all regular node servers and no triple-duty server; that is, the driver and application are on a separate machine. The machine that seemed slow before is still slow now, even though it is only a node. Thus the problem may be specific to the machine, not to doing triple duty. I will wait another 30 minutes; if the slow machine has not finished by then, it will seem highly likely that the problem is machine-specific.
« Last Edit: April 12, 2017, 08:46:43 AM by broiyan »
lolo (Administrator, JPPF Council Member, 2223 posts)
Re: elapsed time for triple duty physical servers
« Reply #3 on: April 12, 2017, 08:39:10 AM »

Thanks for the information. I guess we'll need more information to figure out what's happening.

My suggestion is to set the DEBUG logging level for the relevant parts of the JPPF code, so that the timestamps in the logs allow us to understand which part is taking longer or is triggered later than expected. The settings I would suggest are as follows:

1. In at least one of the node servers and in the node on the triple-duty server, add the following in config/log4j-node.properties:
Code:
log4j.logger.org.jppf.execute=DEBUG
log4j.logger.org.jppf.node=DEBUG
log4j.logger.org.jppf.server.node=DEBUG
log4j.logger.org.jppf.utils.JPPFDefaultUncaughtExceptionHandler=DEBUG

2. In the driver's config/log4j-driver.properties, add these properties:
Code:
log4j.logger.org.jppf.server.nio.nodeserver=DEBUG
log4j.logger.org.jppf.utils.JPPFDefaultUncaughtExceptionHandler=DEBUG

Could you please do these modifications, restart the driver and nodes, run another test and send over the resulting log files?

Thanks,
-Laurent
broiyan (JPPF Grand Master, 54 posts)
Re: elapsed time for triple duty physical servers
« Reply #4 on: April 12, 2017, 09:05:53 AM »

Thanks for the information. As I mentioned in the edit, I now have reason to believe there is a machine specific hardware problem. If changing the machine does not solve it, I will collect the data you asked for.
broiyan (JPPF Grand Master, 54 posts)
Re: elapsed time for triple duty physical servers
« Reply #5 on: April 13, 2017, 06:20:01 AM »

The slow node is now being run stand-alone (no longer triple-duty) and it is still slow.
The cluster consists of 1 server (driver and application) and 1 server (16GB node with Xmx15000M) and 1 server (52GB node with Xmx15000M).

The difference between the fast and slow versions of jppf-node.log is that the slow one has this line:

2017-04-12 23:57:20,039 [INFO ][org.jppf.classloader.ClassLoaderRequestHandler.run(155)]: maxBatchSize = 2

Three files are attached.

lolo (Administrator, JPPF Council Member, 2223 posts)
Re: elapsed time for triple duty physical servers
« Reply #6 on: April 13, 2017, 08:19:11 AM »

Hello,

Thank you very much for providing the logs.

Indeed, the "maxBatchSize = 2" message is the one difference between the two nodes. Note that it appears between these 2 lines:
Code:
2017-04-12 23:57:19,883 [DEBUG][org.jppf.execute.AbstractExecutionManager.execute(154)]: executing 8 tasks
2017-04-12 23:57:20,039 [INFO ][org.jppf.classloader.ClassLoaderRequestHandler.run(155)]: maxBatchSize = 2
2017-04-13 00:02:30,082 [DEBUG][org.jppf.server.node.JPPFNode.processResults(253)]: processing 8 task results for job 'ohanaJob'
This is a good indicator that it happens while the tasks are being executed in the node, as opposed to being read from the network connection or deserialized by the node. This message could be an indication of increased class loading activity initiated by the tasks; it means that a new highest number of concurrent class or resource loading requests has been reached.

One thing worth checking is whether there are differences between the classpaths of the regular and slow nodes. If that doesn't yield any conclusive result, then I would suggest enabling debug-level logging for the class loading:

For a node, add this property in log4j-node.properties:
Code:
log4j.logger.org.jppf.classloader=DEBUG

For the driver, add this property in log4j-driver.properties:
Code:
log4j.logger.org.jppf.server.nio.classloader=DEBUG
These will generate a large amount of additional logging; let's hope it provides us with helpful information as well.

-Laurent
broiyan (JPPF Grand Master, 54 posts)
Re: elapsed time for triple duty physical servers
« Reply #7 on: April 14, 2017, 12:53:19 AM »

I have been unable to post to this forum with attachments so I will try to post this abbreviated reply.

The fast (16GB) and slow (52GB) nodes are deployed via the same rsync command so there should be no differences in the class path.

The major difference between the jppf-node.log files is that the slow node has this entry early in the file:

2017-04-13 17:55:51,853 [DEBUG][org.jppf.node.connection.JPPFDefaultConnectionStrategy.discoverDriver(55)]: Could not auto-discover the driver connection information
2017-04-13 17:55:51,854 [DEBUG][org.jppf.classloader.AbstractClassLoaderConnection.init(78)]: initializing connection
2017-04-13 17:55:51,879 [DEBUG][org.jppf.classloader.AbstractClassLoaderConnection.initChannel(134)]: initializing socket connection
2017-04-13 17:55:56,889 [DEBUG][org.jppf.node.NodeRunner.main(142)]: received reconnection notification : org.jppf.JPPFNodeReconnectionNotification: the JPPF class loader could not reconnect to the server
   at org.jppf.classloader.RemoteClassLoaderConnection.init(RemoteClassLoaderConnection.java:84)
   at org.jppf.classloader.AbstractJPPFClassLoaderLifeCycle.init(AbstractJPPFClassLoaderLifeCycle.java:127)
   at org.jppf.classloader.JPPFClassLoader.<init>(JPPFClassLoader.java:41)
   at org.jppf.node.NodeRunner$2.run(NodeRunner.java:254)
   at org.jppf.node.NodeRunner$2.run(NodeRunner.java:251)
   at java.security.AccessController.doPrivileged(Native Method)
   at org.jppf.node.NodeRunner.getJPPFClassLoader(NodeRunner.java:257)
   at org.jppf.node.NodeRunner.createNode(NodeRunner.java:189)
   at org.jppf.node.NodeRunner.main(NodeRunner.java:138)

The jppf-driver.log file has only 1 WARN. The WARN happens when the slow node is finishing; the fast node has long finished. This is about a 5 minute task and the fast node takes about half that time.

2017-04-13 18:05:28,664 [WARN ][org.jppf.nio.StateTransitionTask.run(89)]: error on channel SelectionKeyWrapper[id=12, readyOps=1, interestOps=0, context=ClientContext[channel=SelectionKeyWrapper[id=12], state=WAITING_JOB, uuid=A28E913E-95CB-DD3C-1812-5039586F5F5E, connectionUuid=A28E913E-95CB-DD3C-1812-5039586F5F5E_1, peer=false, ssl=false], nbTasksToSend=0, completedBundles={[]}] : java.lang.NullPointerException: null
lolo (Administrator, JPPF Council Member, 2223 posts)
Re: elapsed time for triple duty physical servers
« Reply #8 on: April 14, 2017, 08:40:47 AM »

Hello,

Quote
I have been unable to post to this forum with attachments so I will try to post this abbreviated reply.
This is due to the attachment settings for these forums. They were set to a max size of 100 KB, which is small. I changed the setting to 1024 MB, so your logs should now fit if you zip them.

The warning you are getting at the end is triggered by the client disconnecting from the driver, so you can safely ignore it. It happens after the slow node finishes because this node is the last one to return results for the submitted job.

I examined the logs you sent, but unfortunately the class loading logs do not provide a clue as to where the problem is. The class loading requests on both nodes all happen within a similar time frame, and the only gaps in the logs occur after all required classes have been loaded: around 2 min 30 s for the regular node and almost 5 min for the slow one.

What I can suggest is to "benchmark" the performance of each node, by sending a job with identical tasks that perform CPU-bound operations, measure the elapsed time, and return this time along with some information that identifies the node on which each task was executed. Something like this:

Code:
import org.jppf.management.JPPFSystemInformation;
import org.jppf.node.NodeRunner;
import org.jppf.node.protocol.AbstractTask;

public class MeasurementTask extends AbstractTask<String> {
  public long elapsed;
  public String nodeUuid;
  public String ip;

  @Override
  public void run() {
    // gather information to help identify the node
    this.nodeUuid = NodeRunner.getUuid();
    JPPFSystemInformation info = NodeRunner.getNode().getSystemInformation();
    ip = info.getNetwork().getString("ipv4.addresses");
    long start = System.nanoTime();
    // busy loop that won't be optimized out by the compiler
    double sum = 0d;
    for (int i = 1; i <= 500_000_000; i++) {
      sum += Math.sqrt(Math.log(i));
    }
    elapsed = (System.nanoTime() - start) / 1_000_000L;  // converted to millis
    String s = String.format("elapsed = %,d ms, sum = %f, nodeUuid = %s%nipv4 = %s", elapsed, sum, nodeUuid, ip);
    setResult(s);
  }
}

For your convenience, I have attached the code for this task, along with a sample client application that uses it, which you can use as a starting point or just as is.
Could you try running this on a grid with one regular node and the slow one, and let us know if there is any significant difference in performance?

Thanks,
-Laurent
broiyan (JPPF Grand Master, 54 posts)
Re: elapsed time for triple duty physical servers
« Reply #9 on: April 15, 2017, 05:25:51 AM »

A typical result from the CPU-bound test:
----------------------------------------------------

stats for node 777A9ACC-9BA5-4392-445C-A099272FB218: avg=16,267; count=8; min=16,243; max=16,284; ips=192.168.2.246|192.168.2.246 localhost|127.0.0.1
stats for node A9C837E1-08FA-B700-8C0B-D8A9EB636192: avg=16,284; count=8; min=16,211; max=16,303; ips=192.168.2.251|192.168.2.251 localhost|127.0.0.1

There is not much difference in average elapsed time. Multiple runs showed that either unit can be the faster or the slower one, and the difference is usually small. Sometimes the difference is more than 10%, but which unit is slower appears random.

Running my test application (independent driver+application unit, no triple-duty):
---------------------------------------------------------------------------------------------------------

The slow node unit with 52GB was replaced with another unit with 48GB. (Most of the DIMM memory was moved from one unit to the other, so this can be considered a motherboard swap rather than a complete server swap. memtest86+ never complained about this memory, though I only ran it for about 5 minutes, which I suspect is adequate.) The problem persists.

I experimented by moving the slow unit's cable to a different physical port on the Ethernet switch. The slow unit now has a different IP, but the problem persists.

Note there is a strong preference for one node to have more physical memory than the other nodes because of the intention to eventually use one unit for triple-duty. For debugging we can temporarily remove this preference. The slow unit had 48GB so I removed some memory to leave it with 16GB. Thus a test was performed where both node units have 16GB. The problem persists.

A typical result with -Xms15500M -Xmx15500M where both nodes are 16GB units:
fast: 55 seconds
slow: 106 seconds

Hypothesis: The JPPF driver has some "memory" of a node being slow so it might treat it differently.
The Experiment: Delete the driver deployment and redeploy a clean copy.
The Finding: The problem persists.

The next test will be to change the DIMMs.

EDIT: It seems 2 DIMMs of 8GB (16GB total) run slower than 8 DIMMs of 2GB (also 16GB total). It also seems that 6 DIMMs of 8GB (48GB total) run slower than 8 DIMMs of 2GB (16GB total). One or more unpopulated memory slots is correlated with the problem. Experimentally it was shown that it is better for all 8 memory slots to be populated identically, which is MORE STRICT than saying the DIMMs that are present must be identical. The 52GB experiment (6 DIMMs of 8GB and 2 DIMMs of 2GB) can be viewed as a problem of non-identical DIMMs, but preferably as a problem of non-identically populated slots.

« Last Edit: April 15, 2017, 11:00:31 PM by broiyan »
lolo (Administrator, JPPF Council Member, 2223 posts)
Re: elapsed time for triple duty physical servers
« Reply #10 on: April 18, 2017, 06:48:33 AM »

Hello,

Thanks for the test results. The very close processing times obtained from the CPU-bound test show that there is no significant difference in CPU speed between the two machines.

Regarding the hypothesis that the JPPF driver has some "memory" of a node being slow so it might treat it differently, I think we can exclude this from the possible causes. In effect, the nodes logs you provided earlier clearly show that the slower processing occurs strictly within the nodes themselves. This can be deduced from the timestamps of the log messages that indicate the start and end of the tasks execution in the nodes:

Slow node:
Code:
2017-04-13 18:00:11,858 [DEBUG][org.jppf.execute.AbstractExecutionManager.execute(154)]: executing 8 tasks
...
2017-04-13 18:05:22,378 [DEBUG][org.jppf.server.node.JPPFNode.processResults(253)]: processing 8 task results for job 'ohanaJob'

Regular node:
Code:
2017-04-13 18:00:11,360 [DEBUG][org.jppf.execute.AbstractExecutionManager.execute(154)]: executing 8 tasks
...
2017-04-13 18:02:59,308 [DEBUG][org.jppf.server.node.JPPFNode.processResults(253)]: processing 8 task results for job 'ohanaJob'

As we can see, the executions in the nodes start within 500 ms of each other, whereas the slow node finishes 2 min 23 s after the regular node.

The one way the driver could treat the nodes differently would be via the load-balancer, by assigning more tasks to one node than to the other, but here we know that's not the case, since they each receive 8 tasks.

A hypothesis that would explain the behavior you observe is that one of the tasks, possibly more, takes much longer than the others to execute. For instance, the last task added to the job: whatever node it is dispatched to will appear as the slowest. Is there any way you could instrument your tasks to capture their elapsed time, similarly to the CPU-bound test?
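Independently of JPPF, such instrumentation only needs a timestamp pair around the task body; a minimal sketch (the class name and the simulated workload are illustrative, and in a real task the loop would be replaced by the actual work done in run()):

```java
// Illustrative sketch: capture per-task elapsed time around the task body.
public class TimedWork {
  public long elapsedMillis = -1L;

  // Simulates the task body; a real JPPF task would do its actual work here.
  public String runTimed() {
    long start = System.nanoTime();
    double sum = 0d;
    for (int i = 1; i <= 1_000_000; i++) {
      sum += Math.sqrt(i); // stand-in workload
    }
    elapsedMillis = (System.nanoTime() - start) / 1_000_000L; // nanos to millis
    return String.format("elapsed = %,d ms, sum = %f", elapsedMillis, sum);
  }

  public static void main(String[] args) {
    System.out.println(new TimedWork().runTimed());
  }
}
```

Collecting this string as the task result would reveal whether one task dominates the job's elapsed time.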

Another test I was thinking of: could you try with both nodes on the same machine, see if there still is a difference in execution time?

-Laurent
broiyan (JPPF Grand Master, 54 posts)
Re: elapsed time for triple duty physical servers
« Reply #11 on: April 18, 2017, 06:55:54 PM »

Laurent, thank you for your help. Running two nodes on one machine is a good idea and I will probably get around to trying that. In the meanwhile, I consider this problem solved. Pairs of DIMMs must be identical, but not only that: the user manual says that "best system performance" is achieved with 4 identical DIMMs or 8 identical DIMMs, and 6 DIMMs is not an allowed configuration. With the 52GB setup (6x8GB + 2x2GB), the 8 DIMMs are not identical even though the pairs are identical. With the 48GB setup (6x8GB), the DIMMs are identical but 6 DIMMs is a disallowed configuration. If this is the cause, I find the magnitude of the performance difference surprising, since each DIMM presumably adheres to the standard with respect to bus speed and latency. Perhaps the manual should not merely suggest that you will fail to achieve "best system performance" and should instead say that you may experience very poor performance.

There are no abnormal individual tasks as far as I can see. Every task should take the same amount of time. Since they all say "hello" and "goodbye" to the node console I can see whether or not that is the case. They say "hello" at the same time intra-node and across nodes. On the slow server all 8 tasks say "goodbye" after ~5 minutes instead of all 8 saying goodbye after ~3 minutes. This means the 8 tasks on each node are identical as I expected.

Two questions. If the DIMM mismatch problem is eliminated, there will still remain some across-node difference; suppose it is 3%. In 200 hours, 3% is 6 hours, which means the fastest server would be doing nothing for 6 hours and wasting electricity. Will the driver that is still waiting on slower nodes mind if I Control-C each node that is finished and power it off? Is Control-C the best way? In retrospect I should have made many smaller tasks instead of 200-hour tasks.
« Last Edit: April 19, 2017, 08:12:41 AM by broiyan »
lolo (Administrator, JPPF Council Member, 2223 posts)
Re: elapsed time for triple duty physical servers
« Reply #12 on: April 19, 2017, 06:42:51 PM »

Hello,

Quote
Will the driver that is still waiting on slower nodes not mind if I Control-C each node that is finished and power off?

The driver is actually designed to handle these situations; if any problem occurs, then it's a bug and we will fix it. The driving philosophy in JPPF is that nodes leaving or joining the grid are part of the grid's life cycle, not exceptions. Without that, it is very difficult and costly to build robust massively distributed applications. My understanding is that Ctrl-C is the equivalent of a kill -2 (SIGINT) or a System.exit(x) in Java, and the driver handles that very well. The cleanest way to terminate a node would be via the management APIs, or the administration console (which uses these APIs).

This could even be automated, for example using a node lifecycle listener to cause a node to terminate itself when it hasn't received any task after a specified time.
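Setting the JPPF-specific listener hook aside, the underlying idle-timeout idea can be sketched with a plain scheduled watchdog that is re-armed each time a task arrives and runs a shutdown action once the timeout expires (all names here are illustrative, not part of the JPPF API):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustrative idle watchdog: runs an action if no task arrives within the timeout.
public class IdleWatchdog {
  private final ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
  private final long timeoutMillis;
  private final Runnable onIdle;
  private ScheduledFuture<?> pending;

  public IdleWatchdog(long timeoutMillis, Runnable onIdle) {
    this.timeoutMillis = timeoutMillis;
    this.onIdle = onIdle;
    taskReceived(); // arm the timer
  }

  // Call this from a "task received" notification to reset the countdown.
  public synchronized void taskReceived() {
    if (pending != null) pending.cancel(false);
    pending = ses.schedule(onIdle, timeoutMillis, TimeUnit.MILLISECONDS);
  }

  public void shutdown() { ses.shutdownNow(); }

  public static void main(String[] args) throws InterruptedException {
    IdleWatchdog w = new IdleWatchdog(100, () -> System.out.println("idle timeout reached"));
    Thread.sleep(300); // no tasks arrive, so the idle action fires
    w.shutdown();
  }
}
```

In a real node, the onIdle action could trigger the node's own termination, for instance through the management APIs mentioned above.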

-Laurent