JPPF-157 - Improper handling of serialized tasks with a size > 2GB
Posted Jun 19, 2013 - updated Dec 27, 2014
Issue description
When a node returns, after execution, a task whose serialized size is larger than 2GB, the following exception is raised in the driver upon receiving this task:
  at org.jppf.server.nio.AbstractNioMessage.readNextObject(
  at org.jppf.server.nio.nodeserver.AbstractNodeContext.readMessage(
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(
  at org.jppf.server.nio.nodeserver.WaitingResultsState.performTransition(
  at java.util.concurrent.Executors$
  at java.util.concurrent.FutureTask$Sync.innerRun(
  at java.util.concurrent.ThreadPoolExecutor.runWorker(
  at java.util.concurrent.ThreadPoolExecutor$

This is due to a limitation in the communication protocol between node and server. Each message between node and server is made of serialized objects, where each serialized object is preceded by an integer that specifies its size. When the size is > 2GB, it no longer fits in a signed 32-bit integer and overflows, which can cause the node to write, and the server to read, a negative size.
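The overflow can be illustrated in isolation. The following is a minimal sketch, not JPPF code: the class LengthPrefixOverflow and its methods are hypothetical names, and it only assumes that the size prefix is a 4-byte signed int, as described above. It uses the raw payload size from the reproduction steps (10 x 256 MB); the real serialized size would be slightly larger.

```java
import java.nio.ByteBuffer;

public class LengthPrefixOverflow {
  // Hypothetical helper: encodes a size as a 4-byte int prefix, then decodes it,
  // mimicking what a sender writes and a receiver reads.
  public static int roundTripLengthPrefix(long actualSize) {
    ByteBuffer buffer = ByteBuffer.allocate(4);
    buffer.putInt((int) actualSize); // narrowing cast silently wraps above Integer.MAX_VALUE
    buffer.flip();
    return buffer.getInt();
  }

  // Hypothetical helper: returns true if allocating a receive buffer of the
  // decoded size fails with NegativeArraySizeException.
  public static boolean allocationFails(int prefix) {
    try {
      byte[] receiveBuffer = new byte[prefix];
      return receiveBuffer.length != prefix; // not reached for a negative prefix
    } catch (NegativeArraySizeException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    long actualSize = 10L * 256 * 1024 * 1024; // 2,684,354,560 bytes, above 2^31 - 1
    int prefix = roundTripLengthPrefix(actualSize);
    System.out.println("actual size = " + actualSize + ", prefix read = " + prefix);
    System.out.println("allocation fails = " + allocationFails(prefix));
  }
}
```

With these inputs the decoded prefix is -1610612736 (2,684,354,560 minus 2^32), and attempting to allocate a buffer of that size throws NegativeArraySizeException, consistent with the exception reported in the driver's log.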

I also see that this causes the server to disconnect from the node and resubmit the task to the same or another node, so the job never terminates.

The same kind of issue would occur between client and server and needs to be addressed as well.
Steps to reproduce this issue
- configure a node with 4 GB of heap (-Xmx4g)

- submit a job with the following task:
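import java.util.ArrayList;
import java.util.List;
import org.jppf.server.protocol.JPPFTask;

public class SerializationOverflowTask extends JPPFTask {
  private List<byte[]> data = null;
  public void run() {
    int size = 256 * 1024 * 1024; // 256 MB
    data = new ArrayList<byte[]>();
    for (int i=0; i<10; i++) data.add(new byte[size]); // 10 x 256 MB = 2.5 GB in total
  }
}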

==> you get the NegativeArraySizeException in the driver's log

Comment posted by
Jun 23, 09:57
Fixed. Changes committed to SVN:

