JPPF Issue Tracker
CLOSED  Bug report JPPF-359  -  Node unable to reconnect when connection is closed from a separate thread
Posted Jan 14, 2015 - updated Jan 14, 2015
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
    Closed
  • Assigned to
     lolo4j
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     lolo4j
  • Owned by
    Not owned by anyone
  • Category
    Node
  • Resolution
    RESOLVED
  • Priority
    Normal
  • Reproducibility
    Not determined
  • Severity
    Normal
  • Targeted for
    JPPF 4.2.6
Issue description
When testing the disconnection of a node while it is executing a job, I noticed the following two exceptions in the node log. The second exception repeats indefinitely:
2015-01-14 08:28:41,945 [ERROR][main                ][org.jppf.server.node.JPPFNode.run(146)]: 
java.lang.NullPointerException
  at org.jppf.io.SocketWrapperOutputDestination.writeInt(SocketWrapperOutputDestination.java:93)
  at org.jppf.io.IOHelper.writeData(IOHelper.java:133)
  at org.jppf.server.node.remote.RemoteNodeIO.sendResults(RemoteNodeIO.java:125)
  at org.jppf.server.node.AbstractNodeIO.writeResults(AbstractNodeIO.java:157)
  at org.jppf.server.node.JPPFNode.processResults(JPPFNode.java:248)
  at org.jppf.server.node.JPPFNode.processNextJob(JPPFNode.java:197)
  at org.jppf.server.node.JPPFNode.perform(JPPFNode.java:169)
  at org.jppf.server.node.JPPFNode.run(JPPFNode.java:134)
  at org.jppf.node.NodeRunner.main(NodeRunner.java:131)
2015-01-14 08:28:41,950 [WARN ][main                ][org.jppf.classloader.AbstractClassLoaderConnection.sendCloseChannelCommand(105)]: error sending close channel command : java.io.EOFException: null
2015-01-14 08:28:43,021 [ERROR][main                ][org.jppf.server.node.JPPFNode.run(146)]: Could not load class 'org.jppf.utils.SerializationHelperImpl'
java.lang.ClassNotFoundException: Could not load class 'org.jppf.utils.SerializationHelperImpl'
  at org.jppf.classloader.AbstractJPPFClassLoader.findClass(AbstractJPPFClassLoader.java:155)
  at org.jppf.classloader.AbstractJPPFClassLoader.loadJPPFClass(AbstractJPPFClassLoader.java:93)
  at org.jppf.server.node.JPPFContainer.initHelper(JPPFContainer.java:139)
  at org.jppf.server.node.JPPFContainer.init(JPPFContainer.java:94)
  at org.jppf.server.node.JPPFContainer.<init>(JPPFContainer.java:85)
  at org.jppf.server.node.remote.JPPFRemoteContainer.<init>(JPPFRemoteContainer.java:63)
  at org.jppf.server.node.remote.RemoteClassLoaderManager.newJPPFContainer(RemoteClassLoaderManager.java:69)
  at org.jppf.server.node.AbstractClassLoaderManager.getContainer(AbstractClassLoaderManager.java:124)
  at org.jppf.server.node.AbstractCommonNode.getContainer(AbstractCommonNode.java:99)
  at org.jppf.server.node.AbstractNodeIO.postSendResults(AbstractNodeIO.java:181)
  at org.jppf.server.node.AbstractNodeIO.writeResults(AbstractNodeIO.java:159)
  at org.jppf.server.node.JPPFNode.processResults(JPPFNode.java:248)
  at org.jppf.server.node.JPPFNode.processNextJob(JPPFNode.java:205)
  at org.jppf.server.node.JPPFNode.perform(JPPFNode.java:169)
  at org.jppf.server.node.JPPFNode.run(JPPFNode.java:134)
  at org.jppf.node.NodeRunner.main(NodeRunner.java:131)
The node's console output also shows that it tries to reconnect repeatedly but fails each time.
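The report does not state the root cause of the NullPointerException. One plausible mechanism, sketched below with the standard library only, is that the close performed by the other thread clears a reference that the main thread then dereferences while writing results. The class `GuardedOutput` and its methods are hypothetical and are not part of JPPF. The sketch only illustrates how reading the field once and raising an IOException makes a concurrent close look like an ordinary I/O failure; it is not the fix that went into JPPF 4.2.6.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for a socket-backed output; NOT a JPPF class. */
final class GuardedOutput {
  // Cleared by close(), which may run on a different thread than the writer.
  private volatile DataOutputStream out;

  GuardedOutput(OutputStream target) {
    this.out = new DataOutputStream(target);
  }

  /** Unguarded write: dereferences the field directly, so it throws NullPointerException after close(). */
  void writeIntUnguarded(int value) throws IOException {
    out.writeInt(value);
  }

  /** Guarded write: reads the field once and reports a closed connection as an IOException. */
  void writeInt(int value) throws IOException {
    DataOutputStream local = out; // single read, so close() cannot null it between check and use
    if (local == null) throw new IOException("connection closed");
    local.writeInt(value);
  }

  /** Clears the reference first, then closes the underlying stream. */
  void close() throws IOException {
    DataOutputStream local = out;
    out = null;
    if (local != null) local.close();
  }

  public static void main(String[] args) throws Exception {
    final GuardedOutput conn = new GuardedOutput(new ByteArrayOutputStream());
    // Close the connection from a separate thread, as the listener in the report does.
    Thread closer = new Thread(new Runnable() {
      @Override public void run() {
        try {
          Thread.sleep(50L);
          conn.close();
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    }, "closer");
    closer.start();
    closer.join(); // wait until the connection has been closed
    try {
      conn.writeIntUnguarded(1);
    } catch (NullPointerException e) {
      System.out.println("unguarded: NullPointerException");
    }
    try {
      conn.writeInt(1);
    } catch (IOException e) {
      System.out.println("guarded: " + e.getMessage());
    }
  }
}
```

With the guarded variant, the caller sees the same exception type it would get from a broken socket, so whatever recovery path already handles I/O errors can also handle a close issued from another thread.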
Steps to reproduce this issue
1) Simulate a node disconnection from a separate thread while executing a job, using a NodeLifeCycleListener like this:
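// Package names below are assumed from the JPPF 4.x API; adjust them if your version differs.
import org.jppf.node.NodeInternal;
import org.jppf.node.event.NodeLifeCycleEvent;
import org.jppf.node.event.NodeLifeCycleListenerAdapter;

public class MyNodeListener extends NodeLifeCycleListenerAdapter {
  @Override public void jobStarting(NodeLifeCycleEvent event) {
    final NodeInternal node = (NodeInternal) event.getNode();
    // Close the node's connection from a separate thread, 500 ms after the job starts.
    Runnable r = new Runnable() {
      @Override public void run() {
        try {
          Thread.sleep(500L);
          node.getNodeConnection().close();
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    };
    new Thread(r, "MyNodeListener").start();
  }
}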
2) Start a driver and a node, then submit a job with a task that lasts more than 500 ms. The exceptions then appear in the node log, and the reconnection messages appear in the console output.

#3
Comment posted by
 lolo4j
Jan 14, 10:42
Fixed in: