Can JPPF deal with both situations, where a shared filesystem is available and where it isn't, with respect to input/output/data files?
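The thread doesn't spell out an answer, but a common pattern for this (independent of any specific JPPF API, all class and field names below are illustrative) is: when a shared filesystem exists, the task carries only a path; when it doesn't, the input bytes travel inside the serializable task and the result travels back the same way.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration of the two data-transport cases. With a shared
// filesystem the task only carries a path; without one, the input
// bytes are shipped inside the (serializable) task itself.
public class DataTask implements Serializable {
    private final String sharedPath;  // null when there is no shared filesystem
    private final byte[] inlineInput; // used when data must travel with the task
    private String result;

    public DataTask(String sharedPath, byte[] inlineInput) {
        this.sharedPath = sharedPath;
        this.inlineInput = inlineInput;
    }

    /** What the node would execute. */
    public void run() throws Exception {
        byte[] input = (sharedPath != null)
            ? Files.readAllBytes(Path.of(sharedPath)) // shared-filesystem case
            : inlineInput;                            // data shipped with the task
        // Stand-in for real work: uppercase the input text.
        result = new String(input, StandardCharsets.UTF_8).toUpperCase();
    }

    public String getResult() { return result; }

    public static void main(String[] args) throws Exception {
        DataTask noSharedFs = new DataTask(null, "input data".getBytes(StandardCharsets.UTF_8));
        noSharedFs.run();
        System.out.println(noSharedFs.getResult()); // prints: INPUT DATA
    }
}
```

The trade-off is the usual one: inlined data is serialized and sent over the network with every task, so a shared path is preferable for large files.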
How do I start up and take down the JPPF infrastructure on demand for a specific compute cluster's scheduler?
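One way to approach on-demand startup and teardown is to have the cluster scheduler's prolog and epilog hooks manage the JPPF processes. A minimal sketch, assuming a Unix-like system, with `sleep 60` as a stand-in for the real node launch command (which depends on your JPPF installation):

```java
import java.util.concurrent.TimeUnit;

// Sketch of starting and stopping a node process on demand, e.g. from
// a cluster scheduler's prolog/epilog hooks. "sleep 60" is a stand-in
// for the actual node startup command.
public class OnDemandNode {

    private Process nodeProcess;

    /** Prolog: launch the node process. */
    public void start() throws Exception {
        nodeProcess = new ProcessBuilder("sleep", "60").start();
    }

    /** Epilog: take the node down and wait briefly for it to exit. */
    public boolean stop() throws Exception {
        nodeProcess.destroy();
        return nodeProcess.waitFor(5, TimeUnit.SECONDS);
    }

    public boolean isRunning() {
        return nodeProcess != null && nodeProcess.isAlive();
    }

    public static void main(String[] args) throws Exception {
        OnDemandNode node = new OnDemandNode();
        node.start();
        System.out.println("running: " + node.isRunning());
        System.out.println("stopped cleanly: " + node.stop());
    }
}
```

Note that Laurent's recommendation is the opposite: keep the server and nodes running permanently, so the startup cost does not eat into a job's allocated time.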
Yes, this may be an issue for the ideal situation (where I could potentially run a single JPPF infrastructure using compute resources from multiple locations simultaneously). I was hoping there was a way to multiplex things across port 443. If JPPF used a single port, then SSH tunneling could be the way, but I'm not sure there is a way to funnel the use of 3 ports through one. Is the 3-port requirement on the server, the client, or both?
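Whatever tunneling approach ends up being used, the first diagnostic step is checking whether the required ports are reachable at all. A small helper for that (the port numbers you would pass in are your driver's configured ports, not defaults baked in here; the `main` method uses a local `ServerSocket` as a stand-in listener):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Helper to verify that the TCP ports a JPPF driver listens on are
// reachable from this machine, e.g. before and after setting up a tunnel.
public class PortCheck {

    /** Try a plain TCP connect with a short timeout. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a JPPF driver: listen on one ephemeral local port.
        try (ServerSocket server = new ServerSocket(0)) {
            int openPort = server.getLocalPort();
            System.out.println("port " + openPort + " reachable: "
                + isReachable("localhost", openPort, 1000));
        }
    }
}
```

For the tunneling itself, forwarding three ports through one SSH connection is possible (one SSH session can carry several local port forwards), but as Laurent says below, that is a question for a network specialist.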
Hello Ron,

Welcome to the JPPF forums. I have a few answers and guidelines that I hope you will find useful:

1) To run non-Java tasks, I see 2 possible approaches:

- Using a JNI wrapper. This would probably work with C++ or C programs, however I am not sure how easy it is to do with Perl programs. Also, JNI-based implementations are generally tightly coupled with the external programs and pose difficult challenges in terms of maintenance.

- Using JPPF tasks to start external programs. This approach has the advantage of providing very loose coupling between the Java and non-Java parts: your external code keeps doing what it is already doing, while the Java wrapper focuses on providing the required inputs and collecting the resulting outputs. I definitely recommend this approach, as it is, in my opinion, the one that requires the least amount of integration and maintenance. We have actually started working on it, and some APIs are already available in the current distribution. They provide a way to start an external program or shell script from a JPPF task, and to specify a set of input files or URLs as well as output files or URLs. It is not completely tested yet, and I don't think it covers all use cases, however it is something you can use and build upon. Currently, the only documentation available for this is in the javadoc, which you can find at this location. I would be happy to get your feedback on it, and to implement the features you believe are missing or incomplete.

2) From a high-level view, I see 2 main challenges:

- Integration with the cluster architecture. You will probably have to develop a JPPF client that is capable of communicating with the resource manager and transforming a job into a set of JPPF tasks. I recommend that the JPPF server and nodes be running all the time, as opposed to started on demand, so as not to infringe too much on a job's allocated time.

- Scheduling constraints. Here I'm thinking mostly about time constraints, i.e.
how long a job is allowed to run. I believe JPPF can accommodate this by specifying a timeout on the tasks: there is a feature that enables tasks to time out after a specific length of time or at a specified date/time. This is documented here.

With these points in mind, I believe your idea will work.

3) Network restrictions are the biggest pain point when using JPPF. JPPF communications use a custom protocol built directly on top of TCP/IP sockets (no SSL). A JPPF network requires 3 distinct TCP ports, and there is no way around that in the current implementation. It may be possible to work around this limitation through SSH tunneling, or some other kind of tunneling, however I'm no expert on this topic and I would defer to a network specialist.

I hope this helps,
-Laurent
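The external-program wrapper from point 1 and the time constraint from point 2 fit together naturally. This is not the JPPF API itself (that lives in the javadoc mentioned above) but a minimal sketch of what such a wrapper task does internally, using only the JDK's `ProcessBuilder`:

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Sketch of an external-program wrapper: launch the program, enforce a
// timeout, and capture its standard output as the task result.
public class ExternalProgramRunner {

    /** Run a command, enforce a timeout, and return its standard output. */
    public static String run(List<String> command, long timeoutSeconds) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process process = pb.start();

        // Enforce the time constraint; this is the point where a
        // JPPF-style task timeout would kick in.
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroy();
            throw new IllegalStateException("external program timed out");
        }

        // Collect the output (fine for small outputs; large outputs should
        // be drained concurrently to avoid filling the pipe buffer).
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
        }
        return out.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Example: run a shell command (assumes a Unix-like node).
        System.out.println(run(List.of("sh", "-c", "echo hello"), 10));
        // prints: hello
    }
}
```

The same `run` method works for a Perl script or a compiled C/C++ binary, since the wrapper only sees a command line, stdin/stdout, and files.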