
Author Topic: Architecture suggestions for 5000 nodes  (Read 3806 times)


  • JPPF Knight
  • **
  • Posts: 17
Architecture suggestions for 5000 nodes
« on: October 31, 2011, 05:45:37 PM »

Hello everybody,

First of all, I'd like to thank the JPPF team for this awesome project. Currently we have JPPF running on 200 nodes with only one driver (5 GB of memory), and it is running really well. It allowed us to reduce a processing run from 7 hours to 7 minutes.

Now we have plans to increase our grid to 5000 nodes, so we need your insight about it. Can JPPF handle it? Can a single driver handle all nodes? If yes, what kind of machine is required for that? If not, what kind of driver hierarchy do you recommend?

I know that this is a huge grid, so please don't hesitate to say so if you don't know. In that case I will go ahead with a POC to check and will let you know.



  • JPPF Knight
  • **
  • Posts: 17
Re: Architecture suggestions for 5000 nodes
« Reply #1 on: November 10, 2011, 01:05:59 PM »

Hey guys,

Please share your thoughts on this. I need to have at least some feeling about it before I allow my project to allocate this huge number of nodes in the grid.



  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: Architecture suggestions for 5000 nodes
« Reply #2 on: November 14, 2011, 01:51:36 PM »

Hello Bruno,

I do not have experience with such a large grid, but I can provide some pointers.

- In theory, a single JPPF driver should be able to handle that many nodes. However, for each node there will be up to 3 network connections (TCP socket endpoints): one for the job channel, one for the class loader channel and one for the JMX-based management. With 5,000 nodes that is up to 15,000 connections, so you will probably hit the OS limit on file descriptors. If I remember correctly, the default maximum number of open file descriptors is 1024 per process on Linux. You will thus need to increase that value to at least 15,000. See this article for more details.

- We just released JPPF 2.5.4, which fixes an issue impacting the scalability of the server in high-load conditions. We recommend that you upgrade to this version.

- In terms of machine sizing, it's always difficult to answer this question. Obviously, the more cores are available, the better. You will probably need to do some tuning. I especially recommend monitoring the driver's CPU load as the number of nodes grows. If the load reaches the 90% - 100% range, you might consider using more cores, or using multiple drivers with a smaller number of attached nodes each.

- If using a topology with multiple drivers, consider that a node can only be attached to a single driver at a time. Also, I recommend not connecting the drivers to each other, because driver-to-driver communication doesn't scale very well. We have plans to address this, but it won't be any time soon. This implies that you will in fact have several separate grids, all accessible from the same client. So, depending on what kind of payload you submit to the grid, you may also consider splitting your jobs to spread them over multiple drivers, especially if the jobs have a large number of tasks.

- Lastly, the most important configuration parameters you will eventually need to tune are the load balancing and the size of the pool of parallel I/O threads.
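
To make the file descriptor arithmetic from the first point concrete, here is a minimal Java sketch. It only multiplies the node count by the 3 connections per node and compares the result with the limit the JVM reports for its own process. The class name `FdBudget`, its methods and the headroom of 1000 descriptors (for jar files, logs and client connections) are assumptions for illustration and not part of JPPF. The limit is read through the JDK's `com.sun.management.UnixOperatingSystemMXBean`, which is only available on Unix-like systems.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdBudget {
    // Up to 3 connections per node: job channel, class loader channel, JMX management.
    static final int CONNECTIONS_PER_NODE = 3;

    /** Descriptors needed for the node connections plus an assumed headroom. */
    static long requiredDescriptors(int nodes, int headroom) {
        return (long) nodes * CONNECTIONS_PER_NODE + headroom;
    }

    /** Limit on open file descriptors for this JVM process, or -1 if it cannot be read. */
    static long maxDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1L; // not a Unix-like platform
    }

    public static void main(String[] args) {
        int nodes = 5000;
        int headroom = 1000; // assumption: spare descriptors for files and clients
        long needed = requiredDescriptors(nodes, headroom);
        long limit = maxDescriptors();
        System.out.println("descriptors needed: " + needed + ", current limit: " + limit);
        if (limit >= 0 && limit < needed) {
            System.out.println("The limit is too low for " + nodes + " nodes; raise it before starting the driver.");
        }
    }
}
```

Running this with the same user and environment as the driver shows whether the limit that the driver process would inherit is high enough.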

I hope this helps.
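
For the multi-driver point above, splitting a large job could look roughly like the following sketch. It only shows the partitioning step: a list of tasks is cut into near-equal chunks, one per driver. The names `JobSplitter` and `split` are hypothetical, and submitting each chunk as its own job to a given driver is deliberately left out, since that depends on the JPPF client API and configuration in use.

```java
import java.util.ArrayList;
import java.util.List;

class JobSplitter {
    /**
     * Splits the tasks into at most `parts` contiguous chunks
     * whose sizes differ by at most one. Empty chunks are not produced.
     */
    static <T> List<List<T>> split(List<T> tasks, int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        int base = tasks.size() / parts;   // minimum chunk size
        int extra = tasks.size() % parts;  // the first `extra` chunks get one more task
        int start = 0;
        for (int i = 0; i < parts && start < tasks.size(); i++) {
            int size = base + (i < extra ? 1 : 0);
            chunks.add(new ArrayList<>(tasks.subList(start, start + size)));
            start += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            tasks.add(i);
        }
        // Assumption: 3 independent drivers, each with its own set of nodes.
        List<List<Integer>> perDriver = split(tasks, 3);
        for (int d = 0; d < perDriver.size(); d++) {
            System.out.println("driver " + d + " gets " + perDriver.get(d).size() + " tasks");
        }
    }
}
```

Equal chunks are only a starting point. If the drivers have different numbers of nodes attached, sizing each chunk in proportion to the node count would be a natural refinement.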

