Author Topic: Handling Parallel requests when JPPF server or JPPF node not present  (Read 3974 times)

task-manager.co.in

  • Guest

Hi,
I am setting up a JPPF system with two Linux machines:
System 1: runs the JPPF driver and the client
System 2: runs the only JPPF node available to process the requests

The idea is that only System 2 should be processing all the requests, in order to reduce the load on System 1.

Now I am planning to cover the scenarios below:
  1. If there is no JPPF node available to process a request, the request should be processed on the local machine (System 1)
  2. Similarly, if the JPPF driver is not running, the request should still be processed on the local/client machine

While going through the configuration file I found the property jppf.local.execution.enabled = true. As per the documentation, once it is set to true, requests are executed on the local machine when no nodes/driver are available. (This works fine.)
When a driver and node are present in the topology, the requests should be distributed. Unfortunately, after setting jppf.local.execution.enabled = true, I only see requests being processed on the local machine; they do not get distributed even when nodes are present.
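For reference, the change I made in the client configuration file is just the line below:
Code: [Select]
# enable local (in-JVM) execution of tasks in the client
jppf.local.execution.enabled = true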

Am I doing something wrong, or is there a better way I can achieve this configuration (process requests locally if there is no driver/node present, and distribute requests when a driver and nodes are present)?

Thanks in advance,
- task-manager.co.in

lolo

  • Administrator
  • JPPF Council Member

Hello,

The behavior you are observing when local execution is enabled is most likely due to the load-balancing that takes place between remote and local execution: the load balancer distributes the tasks of a job based on how fast the tasks executed in the past. The initial ratio (for the first job) is 50% local vs. 50% remote, then the load balancer adapts to the measured performance of the tasks. So, if your tasks are short-lived, it will end up executing most of them locally, because the overhead of remote execution becomes very significant compared to the tasks' execution time.

The other issue is that "jppf.local.execution.enabled" is a static property, which is read when the JPPF client starts and cannot be changed during its life cycle. This means that, even when a driver becomes available, part of the requests will still be processed locally. Based on this, I have registered a feature request to enable dynamic toggling of local execution: 3300844 - Ability to toggle local execution dynamically. This feature will be available in the next release, JPPF v2.5 (coming soon).
The goal would be to toggle it off when you detect that a driver and node are available, and back on when either the driver or the node becomes unavailable.
The case where a driver connection becomes unavailable is automatically handled by JPPF (if local execution is enabled): JPPF will switch to local mode automatically.
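Just to illustrate the intent of the feature request, the toggle could look like the sketch below; the setLocalExecutionEnabled() method shown here does not exist yet, it is only indicative of what the new API might provide:
Code: [Select]
import org.jppf.client.JPPFClient;

public class LocalExecutionToggle {
  /**
   * Indicative sketch only: setLocalExecutionEnabled() is the method requested in
   * feature 3300844 and is NOT available before JPPF v2.5.
   * @param client the JPPF client whose local executor is toggled.
   * @param gridAvailable whether a driver and at least one node are currently reachable.
   */
  public static void toggleLocalExecution(JPPFClient client, boolean gridAvailable) {
    // turn local execution off when the grid can process the jobs,
    // and back on when it cannot, so jobs keep executing on the client machine
    client.setLocalExecutionEnabled(!gridAvailable);
  }
}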

Another possibility, if this is acceptable for you, would be to have 2 drivers, one on System1 and one on System2, each with one node specifically attached to it, and have the JPPF client use them in failover mode, with the priority given to the driver on System2.
To do this, you will have to disable server discovery in the nodes and in the client, and configure the driver connections manually.

For instance, let's say Node1 is attached to the driver on System1 and Node2 is attached to the driver on System2. You'd configure the following for Node1:
Code: [Select]
jppf.discovery.enabled = false
jppf.server.host = System1
class.server.port = 11111
node.server.port = 11113

Same for Node2:
Code: [Select]
jppf.discovery.enabled = false
jppf.server.host = System2
class.server.port = 11111
node.server.port = 11113

Now the interesting part is the configuration of the client:

Code: [Select]
jppf.drivers = driver1 driver2

driver1.jppf.server.host = System1
driver1.class.server.port = 11111
driver1.app.server.port = 11112
# we give a lower priority than for System2, to ensure
# requests go to System1 only if System2 is unavailable
driver1.priority = 0

driver2.jppf.server.host = System2
driver2.class.server.port = 11111
driver2.app.server.port = 11112
# we give a higher priority than for System1
driver2.priority = 10

Here we use the ability to set a priority on each driver connection, to tell the JPPF client that the driver with the highest priority gets the requests and that, if it becomes unavailable, the driver with the next highest priority gets them. This allows you to define a failover strategy on the client side. On the other hand, if the priorities of the two drivers are equal, then JPPF will pool the connections and distribute requests in a round-robin fashion.
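Note that the failover is driven entirely by the client configuration, so the application code does not change. As a minimal sketch (class and task names are just placeholders), submitting a job from the client would look something like this:

Code: [Select]
import java.util.List;
import org.jppf.client.JPPFClient;
import org.jppf.client.JPPFJob;
import org.jppf.server.protocol.JPPFTask;

public class SubmitExample {
  // placeholder task, only here to make the sketch self-contained
  public static class HelloTask extends JPPFTask {
    @Override
    public void run() {
      setResult("Hello from " + System.getProperty("os.name"));
    }
  }

  public static void main(String[] args) throws Exception {
    // reads jppf.drivers and the per-driver hosts, ports and priorities from the client configuration
    JPPFClient client = new JPPFClient();
    try {
      JPPFJob job = new JPPFJob();
      job.addTask(new HelloTask());
      // blocking submission: returns once all tasks have been executed
      List<JPPFTask> results = client.submit(job);
      for (JPPFTask task : results) System.out.println("result: " + task.getResult());
    } finally {
      client.close();
    }
  }
}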

I hope this helps.

Sincerely,
-Laurent

task-manager.co.in

  • Guest

Hi Laurent,

Thanks a million for your support.

I think your second recommendation is perfect for our setup, i.e. have two drivers, with one node connected to each driver.
Let me set up the same configuration and run through our task execution test cases with all failover scenarios.

If you can help me understand what the behavior will be in the test cases below, it will be a great help for me to verify.

> Client running on System1
> my understanding is shown in brackets


System2
1. Driver 2 and Node 2 running - (Node 2 will process the request)
2. Node 2 is down - (Node 1 will process the request)
3. + Driver 2 is also down - (Node 1 will process the request)

System1
4. + Node 1 is also down - (the request is processed locally)
5. + Driver 1 is also down - (the request is processed locally)

Please correct me if I am wrong.

Just to make sure: if there is no node/driver present, will JPPF process requests locally without any configuration changes?

Thanks in advance,
-task-manager.co.in




task-manager.co.in

  • Guest

Hi,

I tried to execute the failure scenarios with the configuration you suggested:

System 1 - Driver 1, Node 1
System 2 - Driver 2, Node 2 (high priority)
System 3 - Client

Test 1
Driver 1, Node 1, Driver 2, Node 2 all running
Result: requests are processed on System 2 - Node 2

Closed/stopped Node 2:

Result: no more requests are being processed

Error from the client log:


2011-05-12 17:58:01,590 [WARN ][org.jppf.client.ClassServerDelegateImpl.run(149)]: [driver1] caught java.io.EOFException, will re-initialise ...
java.io.EOFException
   at java.io.DataInputStream.readInt(Unknown Source)
   at org.jppf.comm.socket.AbstractSocketWrapper.receiveBytes(AbstractSocketWrapper.java:194)
   at org.jppf.client.AbstractClassServerDelegate.readResource(AbstractClassServerDelegate.java:116)
   at org.jppf.client.ClassServerDelegateImpl.run(ClassServerDelegateImpl.java:114)
   at java.lang.Thread.run(Unknown Source)
2011-05-12 17:58:09,278 [ERROR][org.jppf.client.AsynchronousResultProcessor.run(87)]: [driver1] Software caused connection abort: socket write error
java.net.SocketException: Software caused connection abort: socket write error
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(Unknown Source)
   at java.net.SocketOutputStream.write(Unknown Source)
   at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
   at java.io.BufferedOutputStream.flush(Unknown Source)
   at java.io.DataOutputStream.flush(Unknown Source)
   at org.jppf.comm.socket.AbstractSocketWrapper.writeInt(AbstractSocketWrapper.java:154)
   at org.jppf.io.IOHelper.sendData(IOHelper.java:223)
   at org.jppf.client.AbstractJPPFClientConnection.sendTasks(AbstractJPPFClientConnection.java:217)
   at org.jppf.client.loadbalancer.LoadBalancer$RemoteExecutionThread.run(LoadBalancer.java:337)
   at org.jppf.client.loadbalancer.LoadBalancer.execute(LoadBalancer.java:179)
   at org.jppf.client.AsynchronousResultProcessor.run(AsynchronousResultProcessor.java:75)
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
   at java.util.concurrent.FutureTask.run(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)
The driver/node/client configurations and the client program are attached to this post.

task-manager.co.in

lolo

  • Administrator
  • JPPF Council Member

Hello,

Thanks a lot for this detailed information. I was able to reproduce the same issue on my side.
I believe you have hit a bug in JPPF, which I have registered in our bug tracker: 3301406 - Client failover is not working
Our intent is to fix this bug for the JPPF v2.5 release, which is planned for next week. I apologize for the inconvenience.

Sincerely,
-Laurent

task-manager.co.in

  • Guest

Thanks a million for looking into this issue.
I can give it a try once v2.5 is released from your end.

I also noticed another issue.

System1 - Driver and one node running
System2 - One node running, connected to the driver on System1

Now once both nodes go down (only the driver on System1 remains running), requests are no longer processed.
Is this issue related to the one you mentioned?

Thanks in advance,
task-manager.co.in

lolo

  • Administrator
  • JPPF Council Member

Hi,

In fact this behavior is the expected one. If a driver doesn't have any node attached, it will just keep the jobs in its queue until a node becomes available. This is not related to the previous problem you reported. To mitigate this, there are some things you might consider doing:

1) You can set up the job to expire after a given time, so it won't stay too long in the queue, for instance with code like this:
Code: [Select]
JPPFJob job = ...;
// set the job to expire after 1 minute
JPPFSchedule schedule = new JPPFSchedule(60000L);
job.getJobSLA().setJobExpirationSchedule(schedule);

When the job expires, it is cancelled by the driver, and you can write code on the client side to handle this, for instance along the lines of the sketch below.
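This is only a sketch, assuming a blocking submission of the job shown above; here a task that comes back with neither a result nor an exception is treated as not executed:
Code: [Select]
// sketch: check the returned tasks after a blocking submission
List<JPPFTask> results = client.submit(job);
for (JPPFTask task: results) {
  if (task.getException() != null) {
    // the task failed with an exception
    task.getException().printStackTrace();
  } else if (task.getResult() == null) {
    // the task was not executed, possibly because the job expired
    // and was cancelled by the driver: resubmit it, run it locally, etc.
  } else {
    System.out.println("task result: " + task.getResult());
  }
}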

2) You can run a local node in the same JVM as the driver. This is done by specifying "jppf.local.node.enabled = true" in the driver's configuration file. This node will work exactly like a standalone node, except that it doesn't use the network to communicate with the driver, and trying to shut it down or restart it via the management APIs or the console will in fact kill the driver as well.
The benefit of doing this is that, if the JVM crashes, both the driver and the node die, and the client will detect it, so its failover mechanism will take over - once we have fixed it  :)
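For reference, the corresponding line in the driver's configuration file is simply:
Code: [Select]
# run a node in the same JVM as the driver
jppf.local.node.enabled = true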

I hope this helps.

Sincerely,
-Laurent

task-manager.co.in

  • Guest

Thanks, Laurent.

I am waiting for JPPF release 2.5  :)

Regards,
task-manager.co.in