JPPF Issue Tracker
CLOSED  Bug report JPPF-171  -  Deadlock in the server upon client disconnection
Posted Jul 19, 2013 - updated Jul 27, 2013
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
    Closed
  • Assigned to
     lolo4j
  • Progress
    100%
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     lolo4j
  • Owned by
    Not owned by anyone
  • Category
    Server
  • Resolution
    RESOLVED
  • Priority
    Normal
  • Reproducibility
    Can't reproduce
  • Severity
    Normal
  • Targeted for
    JPPF 3.3.5
Issue description
The following deadlock in the driver was reported:
Found one Java-level deadlock:
=============================
 
"NodeJobServer-4":
  waiting to lock monitor 0x000000000206fef8 (object 0x0000000771e106d0, a org.jppf.server.protocol.ServerTaskBundleClient),
  which is held by "NodeJobServer-2"
 
"NodeJobServer-2":
  waiting to lock monitor 0x00000000018a6538 (object 0x0000000737230518, a org.jppf.server.nio.client.ClientContext),
  which is held by "ClientJobServer-4"
 
"ClientJobServer-4":
  waiting to lock monitor 0x000000000206fef8 (object 0x0000000771e106d0, a org.jppf.server.protocol.ServerTaskBundleClient),
  which is held by "NodeJobServer-2"
 
Java stack information for the threads listed above:
===================================================
 
"NodeJobServer-4":
  at org.jppf.server.protocol.ServerTaskBundleClient.resultReceived(ServerTaskBundleClient.java:179)
  - waiting to lock <0x0000000771e106d0> (a org.jppf.server.protocol.ServerTaskBundleClient)
  at org.jppf.server.protocol.ServerJob.resultsReceived(ServerJob.java:227)
  at org.jppf.server.protocol.ServerTaskBundleNode.resultsReceived(ServerTaskBundleNode.java:172)
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(WaitingResultsState.java:86)
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(WaitingResultsState.java:38)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:82)
  - locked <0x000000071554fff0> (a org.jppf.server.nio.SelectionKeyWrapper)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
 
"NodeJobServer-2":
  at org.jppf.server.nio.client.ClientContext.getInitialBundleWrapper(ClientContext.java:274)
  - waiting to lock <0x0000000737230518> (a org.jppf.server.nio.client.ClientContext)
  at org.jppf.server.nio.client.CompletionListener.taskCompleted(CompletionListener.java:73)
  at org.jppf.server.protocol.ServerTaskBundleClient.fireTasksCompleted(ServerTaskBundleClient.java:334)
  at org.jppf.server.protocol.ServerTaskBundleClient.resultReceived(ServerTaskBundleClient.java:194)
  - locked <0x0000000771e106d0> (a org.jppf.server.protocol.ServerTaskBundleClient)
  at org.jppf.server.protocol.ServerJob.resultsReceived(ServerJob.java:227)
  at org.jppf.server.protocol.ServerTaskBundleNode.resultsReceived(ServerTaskBundleNode.java:172)
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(WaitingResultsState.java:86)
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(WaitingResultsState.java:38)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:82)
  - locked <0x000000070a56f348> (a org.jppf.server.nio.SelectionKeyWrapper)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
 
"ClientJobServer-4":
  at org.jppf.server.protocol.ServerTaskBundleClient.cancel(ServerTaskBundleClient.java:244)
  - waiting to lock <0x0000000771e106d0> (a org.jppf.server.protocol.ServerTaskBundleClient)
  at org.jppf.server.nio.client.ClientContext.cancelJobOnClose(ClientContext.java:261)
  - locked <0x0000000737230518> (a org.jppf.server.nio.client.ClientContext)
  at org.jppf.server.nio.client.ClientContext.handleException(ClientContext.java:112)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:94)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
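From the stack traces, one thread (ClientJobServer-4) is cancelling a job following a client disconnection or close, while two other threads (NodeJobServer-2 and NodeJobServer-4) are reporting results from two nodes. ClientJobServer-4 holds the ClientContext lock and waits for the ServerTaskBundleClient lock, whereas NodeJobServer-2 holds the ServerTaskBundleClient lock and waits for the ClientContext lock: the two locks are acquired in opposite orders. NodeJobServer-4 is not part of the cycle but is blocked on the same ServerTaskBundleClient lock.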
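
For illustration only, here is a minimal sketch of one common way to avoid this kind of lock-order inversion: neither code path calls into the other object while still holding its own monitor. This is not the change that was committed for this issue. The classes Context and Bundle are hypothetical stand-ins for ClientContext and ServerTaskBundleClient, and all method names and bodies are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
  /** Hypothetical stand-in for ClientContext. */
  static class Context {
    private Bundle bundle;

    synchronized void setBundle(Bundle b) { bundle = b; }

    synchronized Bundle getBundle() { return bundle; }

    /** Close path: read the reference under the context lock, cancel after releasing it. */
    void cancelOnClose() {
      Bundle b;
      synchronized (this) { b = bundle; }
      if (b != null) b.cancel(); // bundle lock taken with no context lock held
    }
  }

  /** Hypothetical stand-in for ServerTaskBundleClient. */
  static class Bundle {
    private final Context owner;
    private boolean cancelled;

    Bundle(Context owner) { this.owner = owner; }

    synchronized void cancel() { cancelled = true; }

    /** Result path: read state under the bundle lock, call the owner after releasing it. */
    void resultReceived() {
      boolean notifyOwner;
      synchronized (this) { notifyOwner = !cancelled; }
      if (notifyOwner) owner.getBundle(); // context lock taken with no bundle lock held
    }
  }

  /** Runs two "node" threads and one "client close" thread; returns true if all of them finish. */
  public static boolean runOnce(int iterations) {
    Context ctx = new Context();
    Bundle bundle = new Bundle(ctx);
    ctx.setBundle(bundle);
    CountDownLatch start = new CountDownLatch(1);
    Runnable node = () -> {
      awaitQuietly(start);
      for (int i = 0; i < iterations; i++) bundle.resultReceived();
    };
    Runnable client = () -> {
      awaitQuietly(start);
      for (int i = 0; i < iterations; i++) ctx.cancelOnClose();
    };
    ExecutorService pool = Executors.newFixedThreadPool(3);
    pool.execute(node);
    pool.execute(node);
    pool.execute(client);
    start.countDown(); // release all three threads at once to maximize contention
    pool.shutdown();
    try {
      return pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();
    }
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("completed without deadlock: " + runOnce(10_000));
  }
}
```

Since no thread in this sketch ever holds both monitors at the same time, a cycle like the one in the dump above cannot form. The trade-off is that the state read under one lock may be stale by the time the other object is called, so real code would have to tolerate, for example, a completion notification arriving just after a cancel.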

Steps to reproduce this issue
I have not been able to reproduce this yet.

#4
Comment posted by
 lolo4j
Jul 27, 07:22
Fixed. Changes committed to SVN:

The issue was updated with the following change(s):
  • This issue has been closed
  • The status has been updated, from New to Closed.
  • This issue's progression has been updated to 100 percent completed.
  • The resolution has been updated, from Not determined to RESOLVED.
  • Information about the user working on this issue has been changed, from lolo4j to Not being worked on.