JPPF Issue Tracker
star_faded.png
Please log in to bookmark issues
bug_report_small.png
CLOSED  Bug report JPPF-538  -  Deadlock after call of hasAvailableConnection
Posted Jun 27, 2018 - updated Jul 05, 2018
icon_info.png This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
     
    Closed
  • Assigned to
     lolo4j
  • Progress
       
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     Igor
  • Owned by
    Not owned by anyone
  • Category
    Not determined
  • Resolution
    RESOLVED
  • Priority
    High
  • Reproducability
    Not determined
  • Severity
    Not determined
  • Targetted for
    icon_milestones.png JPPF 5.2.10
Issue description
Hi,

we get sometimes the following deadlock, when we send new tasks to our jppf server. Our code checks before the start, if there are any available connections to jppf server by JPPFClient#hasAvailableConnection. In case that there are no connections available the task is canceled.

Sometimes our application hangs in this process. jstack shows that there is a deadlock between TaskQueueChecker-thread and the thread, where hasAvailableConnection is called.
Java stack information:
===================================================
"ChannelsExecutor-0001":
	at org.jppf.client.balancer.queue.TaskQueueChecker$2.run(TaskQueueChecker.java:177)
	- waiting to lock <0x000000064db25c90> (a org.jppf.utils.collections.SetSortedMap)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
"TaskQueueChecker":
	at org.jppf.utils.ThreadSynchronization.wakeUp(ThreadSynchronization.java:70)
	- waiting to lock <0x000000064db238d8> (a org.jppf.client.balancer.JobManagerClient)
	at org.jppf.client.AbstractGenericClient.statusChanged(AbstractGenericClient.java:435)
	at org.jppf.client.AbstractJPPFClientConnection.fireStatusChanged(AbstractJPPFClientConnection.java:118)
	at org.jppf.client.AbstractJPPFClientConnection.setStatus(AbstractJPPFClientConnection.java:98)
	at org.jppf.client.JPPFClientConnectionImpl.setStatus(JPPFClientConnectionImpl.java:34)
	at org.jppf.client.AbstractJPPFClientConnection.processStatusChanged(AbstractJPPFClientConnection.java:171)
	at org.jppf.client.AbstractJPPFClientConnection.taskServerConnectionStatusChanged(AbstractJPPFClientConnection.java:158)
	at org.jppf.client.JPPFClientConnectionImpl$2.statusChanged(JPPFClientConnectionImpl.java:87)
	at org.jppf.client.AbstractClientConnectionHandler.fireStatusChanged(AbstractClientConnectionHandler.java:156)
	at org.jppf.client.AbstractClientConnectionHandler.setStatus(AbstractClientConnectionHandler.java:123)
	at org.jppf.client.balancer.ChannelWrapperRemote.setStatus(ChannelWrapperRemote.java:100)
	at org.jppf.client.balancer.ChannelWrapperRemote.submit(ChannelWrapperRemote.java:125)
	at org.jppf.client.balancer.ChannelWrapperRemote.submit(ChannelWrapperRemote.java:39)
	at org.jppf.client.balancer.queue.TaskQueueChecker.dispatchJobToChannel(TaskQueueChecker.java:301)
	- locked <0x000000064db26ab0> (a org.jppf.client.balancer.ChannelWrapperRemote)
	at org.jppf.client.balancer.queue.TaskQueueChecker.dispatch(TaskQueueChecker.java:233)
	- locked <0x000000064db25c90> (a org.jppf.utils.collections.SetSortedMap)
	at org.jppf.client.balancer.queue.TaskQueueChecker.run(TaskQueueChecker.java:201)
	at java.lang.Thread.run(Thread.java:748)
"__ejb-thread-pool16":
	at org.jppf.client.balancer.queue.TaskQueueChecker.hasIdleChannel(TaskQueueChecker.java:190)
	- waiting to lock <0x000000064db25c90> (a org.jppf.utils.collections.SetSortedMap)
	at org.jppf.client.balancer.JobManagerClient.hasAvailableConnection(JobManagerClient.java:385)
	- locked <0x000000064db238d8> (a org.jppf.client.balancer.JobManagerClient)
	at org.jppf.client.AbstractGenericClient.hasAvailableConnection(AbstractGenericClient.java:421)
	at de.eagle.platform.outbound.sendout.JPPFGridStrategy2.connect(JPPFGridStrategy2.java:62)
	- locked <0x000000064dec04b8> (a java.lang.Class for de.eagle.platform.outbound.sendout.JPPFGridStrategy2)
	at de.eagle.platform.outbound.sendout.Grid.connect(Grid.java:33)
	at de.eagle.platform.outbound.sendout.SendoutJob.execute(SendoutJob.java:69)
	at de.eagle.jobcontrol.server.JobLauncher.startJob(JobLauncher.java:57)
	at sun.reflect.GeneratedMethodAccessor141.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1081)
	at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1153)
	at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:4836)
	at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:656)
	at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:836)
	at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
	at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73)
	at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
	at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:895)
	at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:835)
	at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
	at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
	at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:140)
	at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:895)
	at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:835)
	at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:374)
	at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:4808)
	at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:4796)
	at com.sun.ejb.containers.EjbAsyncTask.call(EjbAsyncTask.java:101)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Steps to reproduce this issue
I don't have exakt steps how to reproduce this problem but I can try to describe the situation at the moment when the problem usually occured. We use java 8 und jppf 5.2.9-client Our application uses shared static JPPFClient object, which is initialized after the start of our application. At two or three times, when the problem occured, there were two jobs, that were submitted at nearly same time. Our code looks like:

if(JPPF_CLIENT.hasAvailableConnection()){ .... for(){ JPPF_CLIENT.submitJob(job); } } This code can be called by several processes at the same time.


#2
Comment posted by
 lolo4j
Jun 27, 18:33
I reproduce systematically with the following code:
public class TestAvailableConnections {
  public static void main(final String[] args) {
    try (JPPFClient client = new JPPFClient()) {
      JPPFJob job = new JPPFJob();
      job.setBlocking(false);
      job.setName("job name");
      job.add(new MyTask());
      client.submitJob(job);
      for (int i=0; i<1000; i++) {
        client.hasAvailableConnection();
        Thread.sleep(1L);
      }
      List<Task<?>> results = job.awaitResults();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
 
  public static class MyTask extends AbstractTask<String> {
    @Override
    public void run() {
      try {
        Thread.sleep(5000L);
      } catch (Exception e) {
        setThrowable(e);
      }
    }
  }
}
#4
Comment posted by
 lolo4j
Jun 27, 18:34
Here's the deadlock information:
- thread id 29 "JPPF Client-0008" is waiting to lock org.jppf.client.balancer.JobManagerClient@1617b538 which is held by thread id 1 "main"
- thread id 1 "main" is waiting to lock org.jppf.utils.collections.SetSortedMap@35d52352 which is held by thread id 13 "TaskQueueChecker"
- thread id 13 "TaskQueueChecker" is waiting to lock org.jppf.client.balancer.JobManagerClient@1617b538 which is held by thread id 1 "main"
 
Stack trace information for the threads listed above
 
"JPPF Client-0008" - 29 - state: BLOCKED - blocked count: 2 - blocked time: 875 - wait count: 0 - wait time: 0
  at org.jppf.utils.ThreadSynchronization.wakeUp(ThreadSynchronization.java:70)
  - waiting on org.jppf.client.balancer.JobManagerClient@1617b538
  at org.jppf.client.AbstractGenericClient.statusChanged(AbstractGenericClient.java:435)
  at org.jppf.client.AbstractJPPFClientConnection.fireStatusChanged(AbstractJPPFClientConnection.java:118)
  at org.jppf.client.AbstractJPPFClientConnection.setStatus(AbstractJPPFClientConnection.java:98)
  at org.jppf.client.JPPFClientConnectionImpl.setStatus(JPPFClientConnectionImpl.java:34)
  at org.jppf.client.ConnectionInitializer.run(ConnectionInitializer.java:55)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
 
  Locked ownable synchronizers:
  - java.util.concurrent.ThreadPoolExecutor$Worker@5d80de49
 
"main" - 1 - state: BLOCKED - blocked count: 2 - blocked time: 891 - wait count: 80 - wait time: 79
  at org.jppf.client.balancer.queue.TaskQueueChecker.hasIdleChannel(TaskQueueChecker.java:191)
  - waiting on org.jppf.utils.collections.SetSortedMap@35d52352
  at org.jppf.client.balancer.JobManagerClient.hasAvailableConnection(JobManagerClient.java:385)
  - locked org.jppf.client.balancer.JobManagerClient@1617b538
  at org.jppf.client.AbstractGenericClient.hasAvailableConnection(AbstractGenericClient.java:421)
  at test.client.deadlock.TestAvailableConnections.main(TestAvailableConnections.java:40)
 
"TaskQueueChecker" - 13 - state: BLOCKED - blocked count: 3 - blocked time: 890 - wait count: 10 - wait time: 92
  at org.jppf.utils.ThreadSynchronization.wakeUp(ThreadSynchronization.java:70)
  - waiting on org.jppf.client.balancer.JobManagerClient@1617b538
  at org.jppf.client.AbstractGenericClient.statusChanged(AbstractGenericClient.java:435)
  at org.jppf.client.AbstractJPPFClientConnection.fireStatusChanged(AbstractJPPFClientConnection.java:118)
  at org.jppf.client.AbstractJPPFClientConnection.setStatus(AbstractJPPFClientConnection.java:98)
  at org.jppf.client.JPPFClientConnectionImpl.setStatus(JPPFClientConnectionImpl.java:34)
  at org.jppf.client.AbstractJPPFClientConnection.processStatusChanged(AbstractJPPFClientConnection.java:171)
  at org.jppf.client.AbstractJPPFClientConnection.taskServerConnectionStatusChanged(AbstractJPPFClientConnection.java:158)
  at org.jppf.client.JPPFClientConnectionImpl$2.statusChanged(JPPFClientConnectionImpl.java:87)
  at org.jppf.client.AbstractClientConnectionHandler.fireStatusChanged(AbstractClientConnectionHandler.java:156)
  at org.jppf.client.AbstractClientConnectionHandler.setStatus(AbstractClientConnectionHandler.java:123)
  at org.jppf.client.balancer.ChannelWrapperRemote.setStatus(ChannelWrapperRemote.java:100)
  at org.jppf.client.balancer.ChannelWrapperRemote.submit(ChannelWrapperRemote.java:125)
  at org.jppf.client.balancer.ChannelWrapperRemote.submit(ChannelWrapperRemote.java:39)
  at org.jppf.client.balancer.queue.TaskQueueChecker.dispatchJobToChannel(TaskQueueChecker.java:302)
  - locked org.jppf.client.balancer.ChannelWrapperRemote@5cb8a368
  at org.jppf.client.balancer.queue.TaskQueueChecker.dispatch(TaskQueueChecker.java:234)
  - locked org.jppf.utils.collections.SetSortedMap@35d52352
  at org.jppf.client.balancer.queue.TaskQueueChecker.run(TaskQueueChecker.java:201)
  at java.lang.Thread.run(Thread.java:745)
 
  Locked ownable synchronizers:
  - java.util.concurrent.locks.ReentrantLock$NonfairSync@3202c3ef
#6
Comment posted by
 lolo4j
Jun 28, 07:20
Fixed in master and b_5_2 branches. Added fix to patch 01 for JPPF 5.2.9
#8
Comment posted by
 Igor
icon_reply.pngJul 05, 10:19, in reply to comment #6
Thank you for so quick patch!

lolo4j wrote:
Fixed in master and b_5_2 branches. Added fix to patch 01 for JPPF 5.2.9