Hello,
Sorry for the confusion. What we mean by hard failures is a category of crashes or problems that leave network connections in an inconsistent state.
For instance, if you kill a node using "kill -9 <pid>", the node crashes, but the OS is aware of it and cleanly closes the network connections the node had opened. This is not a hard failure. Furthermore, the JPPF server will detect it almost immediately, and recovery of the jobs running on that node can take place.
On the other hand, if you physically pull the network cable from the machine where the JPPF node is running, then the OS on the JPPF server machine will not know about it (unless you're willing to wait for a very long time, usually hours at the minimum). It will still believe the node is up. This is a hard failure, and detecting it requires a separate heartbeat mechanism. This mechanism is what we call "recovery" in the JPPF configuration.
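To illustrate the general idea of a heartbeat check, here is a minimal sketch in Java. It is not the actual JPPF implementation: the class name HeartbeatMonitor, its methods and its parameters are all made up for this example. The server side records the time of the last heartbeat received from each node, and presumes a node dead once more than a given number of heartbeat intervals have gone by in silence.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from JPPF.
final class HeartbeatMonitor {
    private final long intervalMillis;   // expected delay between two heartbeats
    private final int maxMissed;         // missed heartbeats tolerated before giving up
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitor(long intervalMillis, int maxMissed) {
        this.intervalMillis = intervalMillis;
        this.maxMissed = maxMissed;
    }

    // Called each time a heartbeat message arrives from a node.
    void recordHeartbeat(String nodeId, long nowMillis) {
        lastSeen.put(nodeId, nowMillis);
    }

    // A node is presumed dead if it was never heard from, or if it has been
    // silent for longer than maxMissed heartbeat intervals.
    boolean isPresumedDead(String nodeId, long nowMillis) {
        Long last = lastSeen.get(nodeId);
        if (last == null) {
            return true;
        }
        return nowMillis - last > intervalMillis * maxMissed;
    }
}
```

With an interval of 1000 ms and 3 tolerated misses, a node that last answered at t = 0 is still considered alive at t = 3000 and is presumed dead from t = 3001 on, even though the OS still shows its connection as open.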
I hope this clarifies.
Sincerely,
-Laurent