JPPF Issue Tracker
CLOSED  Bug report JPPF-179  -  Deadlock in the driver
Posted Aug 09, 2013 - updated Aug 10, 2013
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
    Closed
  • Assigned to
    Not assigned to anyone
  • Progress
    100%
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     lolo4j
  • Owned by
    Not owned by anyone
  • Category
    Server
  • Resolution
    RESOLVED
  • Priority
    Normal
  • Reproducibility
    Can't reproduce
  • Severity
    Normal
  • Targeted for
    JPPF 3.3.5
Issue description
The following deadlock was reported:
Found one Java-level deadlock:
 
"NodeJobServer-4":
  waiting to lock monitor 0x0000000001491310 (object 0x00000007614aabc0, a org.jppf.server.nio.client.ClientContext),
  which is held by "ClientJobServer-2"
 
"ClientJobServer-2":
  waiting to lock monitor 0x0000000000f1d550 (object 0x000000070d5fece0, a org.jppf.server.protocol.ServerTaskBundleClient),
  which is held by "NodeJobServer-4"
 
Java stack information for the threads listed above:
 
"NodeJobServer-4":
  at org.jppf.server.nio.client.ClientContext.getInitialBundleWrapper(ClientContext.java:274)
  - waiting to lock <0x00000007614aabc0> (a org.jppf.server.nio.client.ClientContext)
  at org.jppf.server.nio.client.CompletionListener.taskCompleted(CompletionListener.java:73)
  at org.jppf.server.protocol.ServerTaskBundleClient.fireTasksCompleted(ServerTaskBundleClient.java:334)
  at org.jppf.server.protocol.ServerTaskBundleClient.resultReceived(ServerTaskBundleClient.java:194)
  - locked <0x000000070d5fece0> (a org.jppf.server.protocol.ServerTaskBundleClient)
  at org.jppf.server.protocol.ServerJob.resultsReceived(ServerJob.java:227)
  at org.jppf.server.protocol.ServerTaskBundleNode.resultsReceived(ServerTaskBundleNode.java:172)
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(WaitingResultsState.java:86)
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(WaitingResultsState.java:38)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:82)
  - locked <0x00000007875e73d8> (a org.jppf.server.nio.SelectionKeyWrapper)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
 
"ClientJobServer-2":
  at org.jppf.server.protocol.ServerTaskBundleClient.cancel(ServerTaskBundleClient.java:242)
  - waiting to lock <0x000000070d5fece0> (a org.jppf.server.protocol.ServerTaskBundleClient)
  at org.jppf.server.nio.client.ClientContext.cancelJobOnClose(ClientContext.java:261)
  - locked <0x00000007614aabc0> (a org.jppf.server.nio.client.ClientContext)
  at org.jppf.server.nio.client.ClientContext.handleException(ClientContext.java:112)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:94)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
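
A report in this format is what a HotSpot JVM prints in a thread dump (for example via jstack). The same monitor cycle can also be queried from inside the process. The following is only a minimal sketch using the standard java.lang.management API; the class name DeadlockProbe and its output format are made up for illustration and are not part of JPPF.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a JPPF class.
public class DeadlockProbe {

  /** Returns one line per thread caught in a monitor deadlock; empty if there is none. */
  public static List<String> describeDeadlocks() {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock is found
    List<String> lines = new ArrayList<>();
    if (ids == null) return lines;
    for (ThreadInfo info : mx.getThreadInfo(ids)) {
      if (info == null) continue; // the thread ended between the two calls
      lines.add("\"" + info.getThreadName() + "\" waiting to lock " + info.getLockName()
          + ", which is held by \"" + info.getLockOwnerName() + "\"");
    }
    return lines;
  }

  public static void main(String[] args) {
    List<String> lines = describeDeadlocks();
    if (lines.isEmpty()) System.out.println("No monitor deadlock detected");
    for (String line : lines) System.out.println(line);
  }
}
```

Calling describeDeadlocks() periodically from a monitoring thread would surface a cycle like the one between "NodeJobServer-4" and "ClientJobServer-2" without an external thread dump.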


Steps to reproduce this issue
I cannot reproduce.

#2
Comment posted by
 lolo4j
Aug 10, 15:29
In fact, this deadlock is already resolved by the fix for Bug report JPPF-171 - Deadlock in the server upon client disconnection. The giveaway is in these two lines of the stack from the thread "ClientJobServer-2":
at org.jppf.server.nio.client.ClientContext.cancelJobOnClose(ClientContext.java:261)
  - locked <0x00000007614aabc0> (a org.jppf.server.nio.client.ClientContext)
This shows that cancelJobOnClose() is locking the ClientContext, because this method was synchronized before the fix. It is no longer synchronized, so it no longer holds the ClientContext monitor that "NodeJobServer-4" is waiting for and cannot block that thread.
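
To illustrate why dropping the synchronized modifier breaks the cycle, here is a simplified sketch. Bundle and Context are hypothetical stand-ins for ServerTaskBundleClient and ClientContext. The method bodies are assumptions made for the example and are not the actual JPPF code.

```java
public class LockOrderSketch {

  /** Stand-in for ServerTaskBundleClient: its methods lock the bundle's own monitor. */
  static class Bundle {
    private Context listener;
    private boolean cancelled;

    synchronized void setListener(Context c) { listener = c; }

    /** Holds the bundle monitor, then calls back into the context (node thread path). */
    synchronized void resultReceived() {
      if (listener != null) listener.getInitialBundle();
    }

    synchronized void cancel() { cancelled = true; }

    synchronized boolean isCancelled() { return cancelled; }
  }

  /** Stand-in for ClientContext. */
  static class Context {
    private final Bundle bundle;

    Context(Bundle b) { bundle = b; }

    synchronized Bundle getInitialBundle() { return bundle; }

    // Not synchronized, as after the fix. With "synchronized" here, this thread would
    // hold the context monitor while waiting for the bundle monitor, which is the
    // opposite order from resultReceived() and allows the reported deadlock.
    void cancelJobOnClose() {
      Bundle b = getInitialBundle(); // context monitor taken and released here
      b.cancel();                    // bundle monitor taken with no other monitor held
    }
  }

  /** Runs both code paths concurrently; returns true if both threads finish in time. */
  public static boolean runBothPaths() throws InterruptedException {
    Bundle bundle = new Bundle();
    Context ctx = new Context(bundle);
    bundle.setListener(ctx);
    Thread node = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) bundle.resultReceived();
    }, "node");
    Thread client = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) ctx.cancelJobOnClose();
    }, "client");
    node.setDaemon(true);   // daemon threads so a hang cannot keep the JVM alive
    client.setDaemon(true);
    node.start();
    client.start();
    node.join(5000);
    client.join(5000);
    return !node.isAlive() && !client.isAlive();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("both threads finished: " + runBothPaths());
  }
}
```

In this sketch the client path never holds two monitors at once, so no circular wait can form, whichever thread runs first.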

The issue was updated with the following change(s):
  • This issue has been closed
  • The status has been updated, from New to Closed.
  • This issue's progression has been updated to 100 percent completed.
  • The resolution has been updated, from Not determined to RESOLVED.