Author Topic: Need help -- Is JPPF right solution for this situation  (Read 3560 times)

arg

  • JPPF Padawan
  • *
  • Posts: 2
Need help -- Is JPPF right solution for this situation
« on: July 05, 2012, 07:14:41 PM »

I need your help validating that JPPF is the right solution (I believe it is).
Here is the context (problem and potential solution):
  • A web application takes input in the form of a list of items (e.g. from Excel or XML) and then passes the data on to the system under discussion.
  • This system then validates the lines against a database, external systems, and some in-process rules.
  • Some of the input data could run into tens of thousands of lines. Consequently, the sequential process takes a long time to complete, annoying users.


The first solution to address the above problem was to:
  • Do all common processing for the whole input.
  • Batch the input into multiple groups of lines and then process them in parallel.
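
For illustration, the batching step above could look roughly like the following minimal sketch. It uses only the JDK's `ExecutorService` and is not JPPF code; the `BatchValidator` class, the `isValid` rule, and the batch size are hypothetical stand-ins for the real validation logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchValidator {
    // Hypothetical per-line rule: a line is valid if it is not blank.
    static boolean isValid(String line) {
        return line != null && !line.trim().isEmpty();
    }

    // Split the input into consecutive batches of at most batchSize lines.
    static List<List<String>> partition(List<String> lines, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += batchSize) {
            int end = Math.min(i + batchSize, lines.size());
            batches.add(new ArrayList<>(lines.subList(i, end)));
        }
        return batches;
    }

    // Validate every batch on a fixed-size thread pool; return the number of invalid lines.
    static int countInvalid(List<String> lines, int batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> work = new ArrayList<>();
            for (List<String> batch : partition(lines, batchSize)) {
                work.add(() -> {
                    int invalid = 0;
                    for (String line : batch) {
                        if (!isValid(line)) invalid++;
                    }
                    return invalid;
                });
            }
            int total = 0;
            // invokeAll blocks until all batches have completed.
            for (Future<Integer> result : pool.invokeAll(work)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("validation interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("validation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "", "b", " ", "c");
        System.out.println("invalid lines: " + countInvalid(lines, 2, 4));
    }
}
```

The thread pool size is the hard limit here: however many batches there are, only that many run at once on the machine.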
Now the resources available on a machine become an issue: when a large input is being processed, it ties up the machine's resources, impacting other (mostly smaller) inputs.
That is worse than the original problem.
So, to work around this issue, we added more machines so that multiple inputs can be processed in parallel.
Each request is still confined to a single machine, so it is constrained by the number of parallel threads available on that machine.
This is where we are as of now.

However, this is still not perfect: for a large input, we confine all the processing to a single machine and do not use the other machines, which could be idle.
This is where JPPF looks like a good fit.

The architecture/topology that I have in mind is that every machine runs a server, at least one node (more on this later), and a client. This way, every machine is individually able to handle requests without depending on any other machine.
The servers on all the machines will be connected to each other.
One large input gets split into multiple tasks. The tasks are added to one job, and the job is sent to a server.
The server then distributes the job's tasks among multiple nodes (local, or remote through another server).
The nodes that are assigned this job's tasks process them in parallel threads.

(I wanted to post a crude picture to help explain the approach, but I am not familiar with including an image in a forum post. I will try that later.)


If I need more processing power, I just add a new machine (a replica of every other machine; they are all homogeneous in terms of OS).


Now the questions are:
  • Is this a valid use case for JPPF?
  • Is this a valid topology, or is it complicating things too much?
  • Can multiple nodes be run on the same machine? Are there any resource contention issues that I need to worry about?


Thanks

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: Need help -- Is JPPF right solution for this situation
« Reply #1 on: July 06, 2012, 08:16:06 AM »

Hello,

Yes, this is definitely something JPPF is designed to do.
The choice of a peer-to-peer topology (i.e. each server connected to all others) has some implications you should be aware of:
- you may face sub-optimal performance due to the topology itself and the way tasks may be routed among the servers. For instance, if you have n servers S1, ..., Sn, each with its node N1, ..., Nn, some tasks could be routed through the path S1, S2, ..., Sn to finally be executed on node Nn. This is where you get the maximum network overhead in such a topology. To address this, you will need to fine-tune the load-balancing on the servers. Fortunately, JPPF is very flexible in that area.
- the great advantage of a P2P topology is the level of redundancy and failover capabilities it provides. If server S1 crashes for any reason, then the client C1 will be able to connect to any (or all) of the other servers, and ensure a continuity of service even in degraded mode.
- if you intend to use a pure P2P topology with a single node on each machine, I would recommend configuring each server with its own local (in-JVM) node, as this avoids the communication overhead between the server and its associated node.
- you will also find practical information on this in this entry of the JPPF blog: http://www.jroller.com/jppf/entry/master_worker_or_p2p_grid

Sincerely,
-Laurent

arg

  • JPPF Padawan
  • *
  • Posts: 2
Re: Need help -- Is JPPF right solution for this situation
« Reply #2 on: July 09, 2012, 11:02:17 AM »

Thanks Laurent. That helps.

I do understand the network overhead of an absolutely pure P2P topology. What I will probably settle for is a middle ground: each server connected to a preset number of other servers, with a higher priority for nodes connected to the current server than for those reached through another server. Not that this will completely eliminate the network overhead. I will work on this further and let you know if I run into any issues. Can a running server adjust the list of servers it is connected to, based on which servers are available at any point in time, without having to be restarted?
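
To make the "preset number of other servers" idea concrete, here is a minimal sketch of one possible way to pick the peers: each server connects to the next k servers on a ring. This is only an assumed selection scheme, not something JPPF prescribes or does by itself; the `PartialMesh` class and the `peersOf` method are hypothetical names, and the sketch says nothing about how the connections would actually be configured.

```java
import java.util.ArrayList;
import java.util.List;

class PartialMesh {
    // For server index i (0-based) out of n servers, return the indices of the
    // k servers that follow it on a ring. With k >= n - 1 this degenerates into
    // the full peer-to-peer topology, where every server knows all the others.
    static List<Integer> peersOf(int i, int n, int k) {
        if (n <= 0 || i < 0 || i >= n) {
            throw new IllegalArgumentException("need 0 <= i < n");
        }
        int limit = Math.min(k, n - 1); // a server is never its own peer
        List<Integer> peers = new ArrayList<>();
        for (int step = 1; step <= limit; step++) {
            peers.add((i + step) % n);
        }
        return peers;
    }

    public static void main(String[] args) {
        int n = 5; // number of servers (assumed)
        int k = 2; // peers per server (assumed)
        for (int i = 0; i < n; i++) {
            System.out.println("S" + (i + 1) + " -> " + peersOf(i, n, k));
        }
    }
}
```

A smaller k means fewer connections per server, at the cost of more hops before a task can reach a distant node.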

Also, I will have multiple nodes per machine, to work around the restriction of one job per node. I guess I could run one of these as a local node and the rest as remote nodes.

Thanks, Arul