
Author Topic: [JPPF v3] Default recovery configuration on server  (Read 2226 times)

kilik

  • JPPF Knight
  • Posts: 16
[JPPF v3] Default recovery configuration on server
« on: May 03, 2012, 10:30:00 AM »

According to the 3.0.1 user guide, the configuration property "jppf.recovery.enabled" defaults to "false" on the server. Does this mean that failover will not take place if a remote node suffers a hardware failure? In practice, I did observe failover when running the template application and simulating a hardware failure on a node, so I am confused about what this property actually controls.

Thanks for your time,

lolo

  • Administrator
  • JPPF Council Member
  • Posts: 2272
Re: [JPPF v3] Default recovery configuration on server
« Reply #1 on: May 04, 2012, 08:43:30 AM »

Hello,

Sorry for the confusion. What we mean by hard failures is a category of crashes or problems that leave network connections in an inconsistent state.

For instance, if you kill a node using "kill -9 <pid>", the node crashes but the OS is aware of it and cleanly closes the network connections the node had opened. This is not a hard failure. Furthermore, the JPPF server will detect it almost immediately, and recovery of the jobs running on that node can take place.
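To illustrate why a clean close is noticed right away, here is a minimal Java sketch. It is not JPPF code: the class and method names are made up for this example, and closing the client socket by hand merely stands in for the OS closing the sockets of a killed process. The remote side sees end-of-stream as soon as the close notification arrives, with no heartbeat needed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of JPPF.
final class CleanCloseDemo {

    /**
     * Opens a loopback connection, closes the client end, and returns what
     * the accepting end reads next. A value of -1 means end of stream.
     */
    static int readAfterPeerClose() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket client = new Socket(loopback, server.getLocalPort());
            try (Socket accepted = server.accept()) {
                // Assumption: this stands in for the OS closing the sockets
                // of a process that was killed.
                client.close();
                // Safety net only, so the demo cannot block forever.
                accepted.setSoTimeout(5000);
                // The peer's close has been signalled, so read() reports
                // end of stream instead of waiting for data.
                return accepted.getInputStream().read();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read() returned " + readAfterPeerClose());
    }
}
```

Depending on the state of the connection when the process dies, the remote side may see a "connection reset" exception rather than end of stream. Either way it learns about the failure promptly.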

On the other hand, if you physically pull the network cable from the machine where the JPPF node is running, then the OS on the JPPF server machine will not know about it (unless you're willing to wait for a very long time, usually hours at a minimum). It will still believe the node is up. This is a hard failure, which requires a separate heartbeat mechanism to detect it. This mechanism is what we call "recovery" in the JPPF configuration.
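The general idea of a heartbeat can be sketched in Java as follows. This is only an illustration under stated assumptions, not JPPF's actual recovery implementation: the class, the method names, the one-byte ping and the retry and timeout values are all invented for the example. A peer that stays connected but never answers plays the role of the node whose cable was pulled. From the monitoring side, the only symptom is silence, so the monitor has to give up after a bounded number of timed-out attempts.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical sketch, not JPPF's recovery mechanism.
final class HeartbeatSketch {

    /**
     * Sends a one-byte ping and waits for a one-byte reply, at most
     * maxRetries times. Returns true if the peer answered, false if it
     * closed the connection or stayed silent on every attempt.
     */
    static boolean isPeerAlive(Socket socket, int maxRetries, int readTimeoutMillis)
            throws IOException {
        socket.setSoTimeout(readTimeoutMillis);
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            socket.getOutputStream().write(1);
            socket.getOutputStream().flush();
            try {
                // -1 means the peer closed the connection cleanly.
                return socket.getInputStream().read() != -1;
            } catch (SocketTimeoutException e) {
                // No reply in time; the socket is still usable, so retry.
            }
        }
        return false; // silent for every attempt: treat the peer as dead
    }

    /** Probes a local peer that either echoes pings or never answers. */
    static boolean probe(boolean peerResponds) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket monitor = new Socket(loopback, listener.getLocalPort());
             Socket peer = listener.accept()) {
            Thread echo = new Thread(() -> {
                try {
                    int b;
                    // A silent peer skips the loop and never replies.
                    while (peerResponds && (b = peer.getInputStream().read()) != -1) {
                        peer.getOutputStream().write(b);
                        peer.getOutputStream().flush();
                    }
                } catch (IOException ignored) {
                    // The socket was closed when the probe finished.
                }
            });
            echo.setDaemon(true);
            echo.start();
            return isPeerAlive(monitor, 3, 200);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("responsive peer alive: " + probe(true));
        System.out.println("silent peer alive: " + probe(false));
    }
}
```

The trade-off is the usual one for heartbeats: a shorter timeout or fewer retries detects a dead peer sooner, but risks declaring a slow peer dead.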

I hope this clarifies things.

Sincerely,
-Laurent

lolo

Re: [JPPF v3] Default recovery configuration on server
« Reply #2 on: May 04, 2012, 08:54:36 AM »

Hello again,

To add to my previous statements, this behavior is specified in the TCP protocol functional specification at http://www.ietf.org/rfc/rfc793.txt, in particular on pages 26 & 27 of the spec.

-Laurent