As such, the compute nodes in the subject cluster can send requests to external platforms (outside the cluster) via TCP, but they are restricted from receiving TCP connections initiated from outside the cluster. Is this a show-stopper, i.e. does it prevent my use of JPPF on such a Linux cluster?

To rephrase: I'd like to run my JPPF-server process on my local server platform, which would in turn submit PBS-job requests via ssh to the PBS front-end platform. Each such PBS-job would consist of starting a JPPF-node process (on a compute node determined by PBS), requesting work from the external JPPF-server, running the JPPF-tasks, and then terminating the JPPF-node process.
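For reference, only outbound connectivity from the compute nodes should be required for that part, since it is the JPPF-node that opens the TCP connection to its server. A minimal node configuration sketch, assuming the standard JPPF node property names and placeholder host/port values:

    # node configuration on the compute node (host and port are placeholders)
    jppf.server.host = my-external-server.example.org
    jppf.server.port = 11111
    # disable UDP multicast discovery, since the server lives outside the cluster
    jppf.discovery.enabled = false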
I'm not sure which type of node you mean, JPPF-server or JPPF-node.
I'm assuming that I would not need a JPPF-node.
All you need is a hook in the JPPF server to execute the code for sending the commands via SSH.
If you want to execute a JPPF job, you need at least one JPPF node to execute it; that's how JPPF works.
1) the listening ProcessBuilder sends an ssh command to the cluster front-end in order to start a JPPF-node on the cluster front-end itself,
2) the JPPF-server (which soon discovers the JPPF-node now running on the cluster front-end) sends the job (tasks) to that JPPF-node, and
3) after the job completes, the still-listening ProcessBuilder sends an ssh command to the cluster front-end in order to kill the remote JPPF-node.

..., true?
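As a rough sketch of that hook, plain java.lang.ProcessBuilder plus ssh is enough, with no JPPF API involved; the host name, node start script and kill pattern below are all assumptions:

    import java.io.IOException;
    import java.util.concurrent.TimeUnit;

    public class RemoteNodeControl {

        // assumed ssh target for the cluster front-end
        private static final String FRONT_END = "user@cluster-front-end";

        // start a JPPF-node on the cluster front-end via ssh (start script path is an assumption)
        public static void startRemoteNode() throws IOException, InterruptedException {
            run("ssh", FRONT_END, "nohup /opt/jppf/node/startNode.sh > node.log 2>&1 &");
        }

        // kill the remote JPPF-node after the job completes (the pkill pattern is an assumption)
        public static void stopRemoteNode() throws IOException, InterruptedException {
            run("ssh", FRONT_END, "pkill -f org.jppf.node.NodeRunner");
        }

        private static void run(String... command) throws IOException, InterruptedException {
            Process p = new ProcessBuilder(command).inheritIO().start();
            if (!p.waitFor(60, TimeUnit.SECONDS)) p.destroyForcibly();
        }
    }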
If I may suggest an alternative with a slight modification, in order to somewhat reduce resource consumption on the cluster's front-end node: the JPPF-server's ProcessBuilder could instead start the JPPF-node on the local machine rather than on the cluster front-end node, and kill the now-local JPPF-node after the job completes. Meanwhile, the local JPPF-node would submit PBS jobs directly to the cluster front-end (via ssh <cluster> qsub ...) and monitor job progress (via ssh <cluster> qstat ...).
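As an illustration of such a task, here is a sketch assuming a JPPF 4+ style task API (tasks extending org.jppf.node.protocol.AbstractTask); the front-end host, the PBS script path and the polling interval are placeholders:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.jppf.node.protocol.AbstractTask;

    public class PbsSubmitTask extends AbstractTask<String> {

        // assumed ssh target for the cluster front-end
        private static final String FRONT_END = "user@cluster-front-end";

        @Override
        public void run() {
            try {
                // submit the PBS job; qsub prints the job id on stdout
                String jobId = exec("ssh", FRONT_END, "qsub /home/user/my_pbs_job.sh").trim();
                // poll qstat until the job is no longer listed in the queue
                while (exec("ssh", FRONT_END, "qstat " + jobId).contains(jobId)) {
                    Thread.sleep(30_000L);  // assumed polling interval
                }
                setResult("PBS job " + jobId + " finished");
            } catch (Exception e) {
                setThrowable(e);
            }
        }

        // run a command and capture its combined stdout/stderr
        private String exec(String... command) throws Exception {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) out.append(line).append('\n');
            }
            p.waitFor();
            return out.toString();
        }
    }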
If you don't need multiple JPPF-nodes to submit multiple PBS-jobs, one option is to use a single JPPF-client with local execution enabled, remote execution disabled, and no JPPF-server or JPPF-node...
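A minimal client configuration sketch for that option, assuming the standard JPPF client property names (worth double-checking against your JPPF version's documentation):

    # run tasks locally in the client JVM
    jppf.local.execution.enabled = true
    # do not attempt to connect to a JPPF server
    jppf.remote.execution.enabled = false
    # disable UDP discovery of servers
    jppf.discovery.enabled = false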
Nodes can only process one job at a time; however, they can execute multiple tasks in parallel.
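The degree of per-node task parallelism is controlled by the node's processing-thread count; a one-line configuration sketch, assuming the standard JPPF node property name (the value is just an example):

    # number of threads the node uses to execute tasks concurrently
    jppf.processing.threads = 8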
(Perhaps the online documentation needs some updates, as I see references to missing content, for example to section 2.4.1 and to chapter 6. Also, when I use the online search field, I'm unable to find any of the known, existing occurrences of "SLA".)
... though not without the extra (albeit perhaps marginal) coding complexity of implementing the corresponding two additional types of JPPF-nodes.
Could you let us know where you found these dead references? We should be able to fix those quickly.
Regarding the local execution configuration, you are right: there is no monitoring occurring in this context.
Actually, there is still something that is not clear to me: what would be running on the PBS cluster? Would it be

1) a PBS-specific kind of job (not using the JPPF APIs)?
2) or the same kind of JPPF-job as for local jobs, in a JPPF-node that would be started in the PBS cluster for that purpose?
My take on this is that, since this information would be known at the time the JPPF-job is submitted, it would be easier to start the JPPF-node (or, for remote nodes, the script that starts the node) from within the client itself. This is definitely less complex than listening to JPPF-job events in the JPPF-server: start the JPPF-node before submitting the JPPF-job, then kill it when the job is complete.
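A sketch of that approach, assuming the standard JPPF client API (JPPFClient / JPPFJob; the submission method is named submitJob or submit depending on the JPPF version) and a placeholder node start script:

    import java.util.List;
    import org.jppf.client.JPPFClient;
    import org.jppf.client.JPPFJob;
    import org.jppf.node.protocol.Task;

    public class ClientManagedNode {

        public static void main(String[] args) throws Exception {
            // 1) start the JPPF-node before submitting the job (local start script path is an
            //    assumption; for a remote node this would be the ssh command from the earlier sketch)
            Process node = new ProcessBuilder("/opt/jppf/node/startNode.sh").inheritIO().start();
            JPPFClient client = new JPPFClient();
            try {
                JPPFJob job = new JPPFJob();
                job.setName("pbs-driver-job");
                job.add(new PbsSubmitTask());  // the task sketched earlier
                // 2) blocking submission: returns once all tasks have completed
                List<Task<?>> results = client.submitJob(job);
                System.out.println("job returned " + results.size() + " task result(s)");
            } finally {
                client.close();
                // 3) kill the node process once the job is complete
                node.destroy();
            }
        }
    }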
In order to manage the node from within the client, I'm not sure how to avoid blocking the client without using a listener in the client, such as in the following; is that right?
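A minimal sketch of what such a listener-based, non-blocking approach might look like, assuming the standard JPPF client API (a non-blocking job plus a JobListener; names may differ slightly across JPPF versions) and reusing the placeholders from the earlier sketches:

    import org.jppf.client.JPPFClient;
    import org.jppf.client.JPPFJob;
    import org.jppf.client.event.JobEvent;
    import org.jppf.client.event.JobListenerAdapter;

    public class NonBlockingSubmit {

        public static void main(String[] args) throws Exception {
            // start the node as before (placeholder path)
            final Process node = new ProcessBuilder("/opt/jppf/node/startNode.sh").inheritIO().start();
            final JPPFClient client = new JPPFClient();
            JPPFJob job = new JPPFJob();
            job.setName("pbs-driver-job");
            job.setBlocking(false);             // submit without blocking the calling thread
            job.add(new PbsSubmitTask());
            job.addJobListener(new JobListenerAdapter() {
                @Override
                public void jobEnded(JobEvent event) {
                    // invoked once all of the job's tasks have returned: clean up
                    node.destroy();
                    client.close();
                }
            });
            client.submitJob(job);              // returns immediately for a non-blocking job
            // the client thread is now free to do other work while the job runs
        }
    }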