
Author Topic: driver and node configuration for jobs that spawn their own jobs  (Read 1917 times)

diverbeard

  • JPPF Padawan
  • Posts: 3
driver and node configuration for jobs that spawn their own jobs
« on: February 13, 2015, 10:59:22 PM »

Hello all,

I was wondering if you could give me some advice.

I have a need to run many parallel but independent applications at once ... ok ... JPPF ... I'm in the right place ... BUT ... this application is an old Delphi application and it does things in inefficient ways ... it writes a lot of data to ASCII temp files ... rereads that data ... I assume because of RAM limitations in the early '90s ...

I have a JPPF cluster on a set of servers that all read/write data back to a NAS.  All of these large files passing across the network between the servers and the NAS are slowing things down.  I'd like to take advantage of high-speed local storage (or maybe a RAM drive) to speed up the process.  To do this, I need to run each analysis on a single server, rather than split things up over all my servers, so that the tasks can write back to the same location on the local drive.  I'll Robocopy the result files to the NAS when I'm finished and destroy the transient data ... I think this will speed things up greatly.
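
To give an idea of the shape of my tasks, here is a simplified sketch of what I have in mind (the executable name, arguments and paths are just placeholders, not the real Delphi application):

Code: [Select]
public class AnalysisTask extends org.jppf.node.protocol.AbstractTask<String> {
  private final String caseId;

  public AnalysisTask(final String caseId) {
    this.caseId = caseId;
  }

  @Override
  public void run() {
    try {
      // fast node-local scratch space (could just as well point to a RAM drive)
      java.nio.file.Path workDir = java.nio.file.Files.createTempDirectory("analysis-" + caseId);
      // launch the legacy executable with its working files on local storage instead of the NAS
      // ("analysis.exe" and its arguments are placeholders)
      Process process = new ProcessBuilder("analysis.exe", caseId, workDir.toString()).start();
      int exitCode = process.waitFor();
      setResult("case " + caseId + " completed with exit code " + exitCode + " in " + workDir);
    } catch (Exception e) {
      setThrowable(e);
    }
  }
}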

Now, how do I do that?

It would be great if each server were a node with slave nodes that only ran processes dispatched by the master, and only on that server ... but ... I don't think that's how slave nodes work in JPPF ... I think the slave nodes work like any other node and will accept queued tasks from the driver, which isn't what I want.

I could set up each server as its own cluster, with its own driver, but then I would have to write something to manage how jobs are distributed to the various drivers ... which I can do if I need to ... but it seems like this is something someone would have needed before, and JPPF could be configured out-of-the-box to just do it. I just don't see how from the documentation.

Can you give me some hints?

Jason


lolo

  • Administrator
  • JPPF Council Member
  • Posts: 2272
    • JPPF Web site
Re: driver and node configuration for jobs that spawn their own jobs
« Reply #1 on: February 15, 2015, 10:30:09 AM »

Hi Jason,

It is my understanding that what you want is to execute all the tasks in each job on the same machine. There are multiple ways to do this, and they all involve the node filtering ability provided by an execution policy, which can be set onto a job's SLA.

The simplest approach is to filter the nodes by their IP address or host name. For this, you need to collect the addresses of the available nodes, which can be done using the management APIs as follows:

Code: [Select]
public static List<String> getAllNodesHosts(JPPFClient client) throws Exception {
  // get a JMX connection to the driver
  JMXDriverConnectionWrapper jmx = getJMXConnection(client);
  // request information on all the nodes
  Collection<JPPFManagementInfo> nodes = jmx.nodesInformation();
  Set<String> result = new HashSet<>();
  // collect all the nodes host names / ip addresses
  for (JPPFManagementInfo nodeInfo: nodes) {
    // the Set takes care of eliminating duplicate hosts
    result.add(nodeInfo.getHost());
  }
  return new ArrayList<>(result);
}

public static JMXDriverConnectionWrapper getJMXConnection(JPPFClient client) throws Exception {
  JPPFConnectionPool pool = null;
  // wait until there is a connection pool available
  while ((pool = client.getConnectionPool()) == null) Thread.sleep(10L);
  JMXDriverConnectionWrapper jmx = null;
  // wait until the pool has an established JMX connection
  while ((jmx = pool.getJmxConnection(true)) == null) Thread.sleep(10L);
  return jmx;
}

Once you have the list of node hosts, you can use them in a round-robin fashion for all your jobs, for example:

Code: [Select]
try (JPPFClient client = new JPPFClient()) {
  List<JPPFJob> jobs = ...;
  List<String> nodeHosts = getAllNodesHosts(client);
  int hostIndex = 0;
  for (JPPFJob job: jobs) {
    // pick the next host in round-robin fashion
    String host = nodeHosts.get(hostIndex % nodeHosts.size());
    hostIndex++;
    // execute only on the machine which has 'host' as IPv4 or IPv6 address or host name
    ExecutionPolicy policy = new Contains("ipv4.addresses", true, host).or(new Contains("ipv6.addresses", true, host));
    job.getSLA().setExecutionPolicy(policy);
    client.submitJob(job);
  }
} catch (Exception e) {
  e.printStackTrace();
}
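
One thing to keep in mind with the loop above: JPPF jobs are blocking by default, so submitJob() would wait for each job to complete before the next one is submitted. If you want the jobs to actually run concurrently on their respective machines, you can submit them as non-blocking jobs and collect the results afterwards, along these lines (this sketch assumes a recent 4.x client where JPPFJob.setBlocking() and JPPFJob.awaitResults() are available):

Code: [Select]
// submit all the jobs without blocking, then gather the results
for (JPPFJob job: jobs) {
  job.setBlocking(false);               // submitJob() returns immediately
  // ... set the execution policy as shown above ...
  client.submitJob(job);
}
for (JPPFJob job: jobs) {
  List<Task<?>> results = job.awaitResults();  // wait for this job to complete
  // process the results for this job
}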

With this approach, you effectively implement a partitioning of the grid without the need to run multiple JPPF servers (drivers), which makes life a lot simpler. This also allows you to use node provisioning without further concern, since each slave node will have the same IP address as its master node.
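
For reference, a minimal master node configuration for provisioning could look like the snippet below (property names taken from the JPPF 4.x node provisioning documentation; please double-check them against the version you are running):

Code: [Select]
# jppf-node.properties on each physical server (master node)
# mark this node as a master able to provision slave nodes on the same machine
jppf.node.provisioning.master = true
# directory prefix under which the slave nodes' files are created
jppf.node.provisioning.slave.path.prefix = slave_nodes/node_
# number of slave nodes to start together with the master
jppf.node.provisioning.startup.slaves = 3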

Quote
I don't think that's how slave nodes work in JPPF ...
That is totally correct: slave nodes are pretty much standard nodes, independent from the master from a task distribution perspective. The difference is that the master node controls the JVM processes of all its slaves: if the master dies, all its slaves are terminated.

I hope this answers your questions.

Sincerely,
-Laurent