
Author Topic: portation of a win prototype to a linux cluster  (Read 7230 times)

cm

  • JPPF Master
  • ***
  • Posts: 38
portation of a win prototype to a linux cluster
« on: April 08, 2014, 06:25:17 PM »

Hi Laurent,

I have a prototype of a JPPF application that runs fine on my Windows laptop, with the client, server and node all running on the same machine. In this prototype I use the classpath approach to send the archives required by the task to the node together with the task, as you recommended in http://www.jppf.org/forums/index.php/topic,5333.msg9090.html#msg9090 .

In my distributed version, the client runs on my Windows laptop and the server and node run on different Linux servers.
When I start the client, I get JobEvents saying that the task is broken in the node.

java.lang.ClassNotFoundException: Could not load class 'model.Duration_Test.Model_Model'
   at org.jppf.classloader.AbstractJPPFClassLoader.findClass(AbstractJPPFClassLoader.java:123)
   at org.jppf.classloader.AbstractJPPFClassLoader.findClass(AbstractJPPFClassLoader.java:109)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
   at org.jppf.classloader.AbstractJPPFClassLoader.loadClass(AbstractJPPFClassLoader.java:337)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
   at epc_simulator.jppf.client.EpcSimulatorTask.run(EpcSimulatorTask.java:66)
   at org.jppf.server.node.NodeTaskWrapper.run(NodeTaskWrapper.java:145)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)

The missing class is part of an archive that is set in the classpath.

In the node config, jppf.discovery.enable = true is set.
When I try to set jppf.discovery.enable = false,
my client gets no answer from the server.

I have attached the configuration files. Do you have an idea what the error is?

Thanks
christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #1 on: April 09, 2014, 07:25:44 AM »

Hi Christian,

Clearly, it seems the jar archive that contains the missing class has not been added to class loader's classpath (normally via AbstractJPPFClassLoader.addURL() calls). Can you check it out, by printing the result of the class loader's toString() method?
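
A minimal sketch of such a check, using only the JDK. It assumes the task's class loader is a java.net.URLClassLoader subclass (the addURL() call mentioned above suggests this, but it is an assumption here). ClassLoaderDump and describe() are hypothetical names, and example.jar is a placeholder. Inside a task, the argument would be getClass().getClassLoader().

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of JPPF.
final class ClassLoaderDump {
  // Returns the class loader's type and, when it is a URLClassLoader, its URL classpath.
  static String describe(ClassLoader cl) {
    if (cl == null) return "bootstrap class loader";
    StringBuilder sb = new StringBuilder(cl.getClass().getName());
    if (cl instanceof URLClassLoader) {
      URL[] urls = ((URLClassLoader) cl).getURLs();
      sb.append(" classpath=").append(Arrays.toString(urls));
    }
    return sb.toString();
  }

  public static void main(String[] args) throws Exception {
    // Placeholder jar location, used only to have something to print.
    URL jar = Paths.get("example.jar").toUri().toURL();
    try (URLClassLoader cl = new URLClassLoader(new URL[] { jar }, null)) {
      System.out.println(describe(cl));
    }
  }
}
```

An empty URL list in the output would point to the jars never having been added to the class loader.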

Thanks,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #2 on: April 09, 2014, 07:47:56 PM »

Hi Laurent,

I have checked it. I sent the result of cl.toString() back to the client.

When driver and node are running on localhost all is fine and I get:
model.Duration_Test.Model_Model   runSimulation    Duration_Test.xml
   Daten-2013-11.jar
   desmoj-2.3.5-complete-bin.jar
   Duration_Test.jar
   EpcSimTools_20140115.jar

client process id: 3400
[client: driver1 - ClassServer] Attempting connection to the class server at localhost:11111
[client: driver1 - ClassServer] Reconnected to the class server
[client: driver1 - TasksServer] Attempting connection to the JPPF task server at localhost:11111
[client: driver1 - TasksServer] Reconnected to the JPPF task server
Duration_Test  task_0: finished
JPPFClassLoader[id=27, type=client, uuidPath=[1AE55F6D-42B3-02CD-922E-62B7368F22D3, A1EE504B-F14C-9FE4-0BCD-F83BD4AE0F3F], offline=false, classpath=file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Duration_Test.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/EpcSimTools_20140115.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/desmoj-2.3.5-complete-bin.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Daten-2013-11.jar]
Duration_Test  task_1: finished
JPPFClassLoader[id=28, type=client, uuidPath=[1AE55F6D-42B3-02CD-922E-62B7368F22D3, A1EE504B-F14C-9FE4-0BCD-F83BD4AE0F3F], offline=false, classpath=file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Duration_Test.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/EpcSimTools_20140115.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/desmoj-2.3.5-complete-bin.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Daten-2013-11.jar]
Duration_Test  task_2: finished
JPPFClassLoader[id=29, type=client, uuidPath=[1AE55F6D-42B3-02CD-922E-62B7368F22D3, A1EE504B-F14C-9FE4-0BCD-F83BD4AE0F3F], offline=false, classpath=file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Duration_Test.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/EpcSimTools_20140115.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/desmoj-2.3.5-complete-bin.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Daten-2013-11.jar]
Duration_Test  task_3: finished
JPPFClassLoader[id=30, type=client, uuidPath=[1AE55F6D-42B3-02CD-922E-62B7368F22D3, A1EE504B-F14C-9FE4-0BCD-F83BD4AE0F3F], offline=false, classpath=file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Duration_Test.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/EpcSimTools_20140115.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/desmoj-2.3.5-complete-bin.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Daten-2013-11.jar]
Duration_Test  task_4: finished
JPPFClassLoader[id=31, type=client, uuidPath=[1AE55F6D-42B3-02CD-922E-62B7368F22D3, A1EE504B-F14C-9FE4-0BCD-F83BD4AE0F3F], offline=false, classpath=file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Duration_Test.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/EpcSimTools_20140115.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/desmoj-2.3.5-complete-bin.jar;file:/D:/eclipseWorkspaces/jppf_0/EpcSimulator_2/_simulation/Daten-2013-11.jar]
Modeldir exist: true
fertig


And when I use a driver and node in my linux cluster I get:
model.Duration_Test.Model_Model   runSimulation    Duration_Test.xml
   Daten-2013-11.jar
   desmoj-2.3.5-complete-bin.jar
   Duration_Test.jar
   EpcSimTools_20140115.jar

client process id: 1344
[client: driver1 - ClassServer] Attempting connection to the class server at bs21.wi-bw.tfh-wildau.de:11111
[client: driver1 - ClassServer] Reconnected to the class server
[client: driver1 - TasksServer] Attempting connection to the JPPF task server at 194.95.44.196:11111
[client: driver1 - TasksServer] Reconnected to the JPPF task server
Duration_Test  task_4: broken
JPPFClassLoader[id=2, type=client, uuidPath=[9F69CC02-B837-EFC7-E6C0-858D4E227697, 2DCDA77E-02DE-6300-4E1D-C389F47A6216], offline=false, classpath=]
Duration_Test  task_3: broken
JPPFClassLoader[id=2, type=client, uuidPath=[9F69CC02-B837-EFC7-E6C0-858D4E227697, 2DCDA77E-02DE-6300-4E1D-C389F47A6216], offline=false, classpath=]
Duration_Test  task_0: broken
JPPFClassLoader[id=2, type=client, uuidPath=[9F69CC02-B837-EFC7-E6C0-858D4E227697, 2DCDA77E-02DE-6300-4E1D-C389F47A6216], offline=false, classpath=]
Duration_Test  task_1: broken
JPPFClassLoader[id=2, type=client, uuidPath=[9F69CC02-B837-EFC7-E6C0-858D4E227697, 2DCDA77E-02DE-6300-4E1D-C389F47A6216], offline=false, classpath=]
Duration_Test  task_2: broken
JPPFClassLoader[id=2, type=client, uuidPath=[9F69CC02-B837-EFC7-E6C0-858D4E227697, 2DCDA77E-02DE-6300-4E1D-C389F47A6216], offline=false, classpath=]
Modeldir exist: true
fertig

You are right: in the latter case the classpath is not transferred from the client to the driver and node.
I think it is a configuration error.
My configuration files are attached to my post from yesterday.

Thanks
christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #3 on: April 09, 2014, 08:04:28 PM »

Hello Christian,

I checked your configuration files, and I don't think the problem is there. I think the problem would be in the code which sets the class loader's classpath. Could you post this code? In particular, how are you transporting the jar files, copying them into the linux node's file system and adding them to the class loader's classpath?

Thanks for your time.

Sincerely,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #4 on: April 09, 2014, 10:09:04 PM »

Hi Laurent,

the code is in the attachment.
From the epc_simulator.jppf.node and META-INF packages I build a jar archive, which is added to the lib directory of the driver.

In the client the archives are added to the classpath of the job.
      ClassPath classpath = job.getSLA().getClassPath();
      for(String archiv : this.archives){
         File jar                = new File(modelDir, archiv);
         Location<String> location    = new FileLocation(jar);
         classpath.add(archiv, location);
      }
      classpath.setForceClassLoaderReset(true);

With the node handler MyJobClassPathHandler the archives are added to the classloader.

Thanks
Christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #5 on: April 10, 2014, 07:54:16 AM »

Hi Christian,

Thanks for the code snippet. This is what I was suspecting: the classpath you are setting in the job SLA only transports the location (i.e. the path) of the jar files, not their content. Since this location doesn't exist on the node's file system, these jars will not be put in the classpath. Furthermore, their location is potentially expressed as a Windows path, which will not work on a Linux node. Instead, what you need to do is copy the file content into memory as the source location and define the destination location on the Linux file system. This can be done as follows:

Code: [Select]
ClassPath classpath = job.getSLA().getClassPath();
for (String archiv : this.archives) {
  File jar = new File(modelDir, archiv);
  Location<String> location = new FileLocation(jar);
  // copy it in memory
  Location<byte[]> memLocation = location.copyTo(new MemoryLocation((int) jar.length()));
  // set the destination file on the node file system, for instance in "tmplib" under the node's root dir
  File destJar = new File("./tmplib/" + archiv);
  Location<String> destination = new FileLocation(destJar);
  classpath.add(archiv, memLocation, destination);
}

With this, the classpath handler of the node will know the content of the jars and it will also know where to put them on the file system.
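
The difference between shipping a path and shipping content can be sketched with plain JDK calls. This is not the JPPF API and not necessarily how the node handler works: JarShipping, readContent(), store() and loaderFor() are hypothetical names, and the temporary files stand in for a real jar and a real node directory.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not part of JPPF.
final class JarShipping {
  // "Client side": read the whole jar into memory so the bytes, not the path, travel.
  static byte[] readContent(Path jar) throws IOException {
    return Files.readAllBytes(jar);
  }

  // "Node side": write the received bytes to a path that is valid on the node,
  // creating the target directory first since it may not exist yet.
  static Path store(byte[] content, Path destination) throws IOException {
    Path parent = destination.toAbsolutePath().getParent();
    if (parent != null) Files.createDirectories(parent);
    return Files.write(destination, content);
  }

  // Once the file exists locally, its URL can be handed to a class loader.
  static URLClassLoader loaderFor(Path jar) throws IOException {
    return new URLClassLoader(new URL[] { jar.toUri().toURL() });
  }

  public static void main(String[] args) throws IOException {
    Path source = Files.createTempFile("model", ".jar"); // stand-in for a real jar
    Files.write(source, new byte[] { 'P', 'K', 3, 4 });
    Path dest = Files.createTempDirectory("node").resolve("tmplib").resolve("model.jar");
    Path stored = store(readContent(source), dest);
    try (URLClassLoader cl = loaderFor(stored)) {
      System.out.println(cl.getURLs()[0]);
    }
  }
}
```

The point of the sketch is that the destination path is chosen for the receiving machine, independently of where the jar lived on the sender.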

Sincerely,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #6 on: April 10, 2014, 04:52:42 PM »

Hi Laurent,

thanks for your answer. I have changed the classpath setting in the client to:
      ClassPath classpath = job.getSLA().getClassPath();
      for(String archiv : this.archives){
         File jar                = new File(modelDir, archiv);
         Location<String> location    = new FileLocation(jar);
         // copy it in memory
         Location<byte[]> memLocation = location.copyTo(new MemoryLocation((int) jar.length()));
         // set the destination file on the node file system, for instance in "tmplib" under the node's root dir
         File destJar = new File("/tmplib/" + archiv);
         Location<String> destination = new FileLocation(destJar);
         classpath.add(archiv, memLocation, destination);
      }
      classpath.setForceClassLoaderReset(true);

I also added a directory named tmplib, with mode 777, at the root of both the driver and the node server.

When I start the client, after waiting some minutes I get:
client process id: 3568
[client: driver1 - ClassServer] Attempting connection to the class server at bs21.wi-bw.tfh-wildau.de:11111
[client: driver1 - ClassServer] Reconnected to the class server
[client: driver1 - TasksServer] Attempting connection to the JPPF task server at 194.95.44.196:11111
[client: driver1 - TasksServer] Reconnected to the JPPF task server
Duration_Test  task_3: broken
JPPFClassLoader[id=2, type=client, uuidPath=[B49ACF03-E788-E70B-AFEF-513ECC070350, 2497EE3D-E629-B5E6-09F1-83EF2A4EC13F], offline=false, classpath=]
Duration_Test  task_2: broken
JPPFClassLoader[id=2, type=client, uuidPath=[B49ACF03-E788-E70B-AFEF-513ECC070350, 2497EE3D-E629-B5E6-09F1-83EF2A4EC13F], offline=false, classpath=]
Duration_Test  task_1: broken
JPPFClassLoader[id=2, type=client, uuidPath=[B49ACF03-E788-E70B-AFEF-513ECC070350, 2497EE3D-E629-B5E6-09F1-83EF2A4EC13F], offline=false, classpath=]
Duration_Test  task_0: broken
JPPFClassLoader[id=2, type=client, uuidPath=[B49ACF03-E788-E70B-AFEF-513ECC070350, 2497EE3D-E629-B5E6-09F1-83EF2A4EC13F], offline=false, classpath=]
Duration_Test  task_4: broken
JPPFClassLoader[id=2, type=client, uuidPath=[B49ACF03-E788-E70B-AFEF-513ECC070350, 2497EE3D-E629-B5E6-09F1-83EF2A4EC13F], offline=false, classpath=]
Modeldir exist: false
fertig

The content of the classpath is not transferred. I have the same situation when I use the driver and node on localhost.
Are there special requirements for the archives in the classpath? My archives are not signed. Is this required? ClassPathElement has a validate() method that checks this.

Thanks

christian
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #7 on: April 10, 2014, 09:25:55 PM »

Hi Laurent,

I have checked on the localhost:
         File destJar = new File("C:/Users/Christian/Downloads/JPPF-4.0.1-node/tmplib/"+archiv);
         Location<String> destination = new FileLocation(destJar);
         classpath.add(archiv, memLocation, destination);
The jars are copied into "C:/Users/Christian/Downloads/JPPF-4.0.1-node/tmplib/" and all is running.

and in the linux cluster:
         File destJar = new File("/tmplib/" + archiv);
         Location<String> destination = new FileLocation(destJar);
         classpath.add(archiv, memLocation, destination);
Nothing is copied into "/tmplib/", neither on the server nor on the node. The directory "/tmplib/" has mode 777. Where are the archives expected, on the server or on the node?
The client runs for several minutes; I think it is waiting for a timeout.

christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #8 on: April 11, 2014, 02:46:13 AM »

Hi Christian,

The jar files are expected to be copied into the node's '/tmplib' directory. They will not be copied onto the server, which doesn't need them anyway.
Did you check the node's log file for any warnings or errors?

To better understand what is going on, I have added a bunch of logging statements to your MyJobClassPathHandler, in the source attached to this post. These will log at info level, so there should be no need to change the Log4j configuration. Would you mind using this new version and posting the resulting node log?

Thanks a lot,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #9 on: April 11, 2014, 10:06:10 AM »

Hi Laurent,

I tested your modified listener on localhost and in the Linux cluster. The log file from the node in the Linux cluster is attached.
In both cases I put the new jar with the modified handler in the driver's lib directory.

On localhost the log includes a lot of additional information, but in the Linux case I can't see this additional information.

christian
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #10 on: April 11, 2014, 10:41:11 AM »

Hi Laurent,

on my Linux cluster the node log is created when the node starts and is not changed afterwards.
Is the listener really started on the node?
I have copied the jar with the listener into the node's lib directory, but nothing changed.

Christian
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #11 on: April 11, 2014, 12:42:26 PM »

Hi Laurent,

I added a constructor to MyJobClassPathHandler:
   public MyJobClassPathHandler(){
      super();
      log.info("MyJobClassPathHandler installed");
   }

and when I start the node on localhost, the last line in my log file is
2014-04-11 12:28:28,595 [INFO ][epc_simulator.jppf.node.MyJobClassPathHandler.<init>(35)]: MyJobClassPathHandler installed
so I can see that MyJobClassPathHandler is started.

When I do the same on the Linux cluster, this line is missing from the log file,
so I think MyJobClassPathHandler is not started. This may be the reason that the classpath is empty.

The question remains why it is not started.

Christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #12 on: April 11, 2014, 12:58:44 PM »

Hi Christian,

Great, I was going to ask you to add a constructor :)
Another thing you can do to get more information on whether MyJobClassPathHandler is found and loaded is to add the following lines to log4j-node.properties:

Code: [Select]
log4j.logger.org.jppf.node.event.LifeCycleEventHandler=DEBUG
log4j.logger.org.jppf.utils.ServiceFinder=DEBUG

The first line should cause the node to print a debug statement like this: "successfully added node life cycle listener MyJobClassPathHandler"
The second line will only print a stack trace in case of an exception.

Sincerely,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #13 on: April 11, 2014, 02:16:38 PM »

Hi Laurent,

I have added the 2 lines to the log4j configuration file. The result from the Linux node is attached.

I have also run it on localhost. The following lines are missing on the Linux node:
2014-04-11 13:52:29,132 [INFO ][epc_simulator.jppf.node.MyJobClassPathHandler.<init>(35)]: MyJobClassPathHandler installed
2014-04-11 13:52:29,132 [DEBUG][org.jppf.node.event.LifeCycleEventHandler.loadListeners(243)]: successfully added node life cycle listener epc_simulator.jppf.node.MyJobClassPathHandler

How does the node get the information to start the class path handler? The handler is stored as a jar in the driver's lib directory.
Is the handler transferred from the driver to the node?

Christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #14 on: April 11, 2014, 06:31:56 PM »

Hi Christian,

Yes the MyJobClassPathHandler class is downloaded by the node from the driver. I'm puzzled as to why it works with your Windows node and not with the Linux one.
Could you add the following line to the node's log4j-node.properties:
Code: [Select]
log4j.logger.org.jppf.node.classloader=DEBUG

This will add a whole load of logging for the node class loading code. Then you would need to check for the parts where the class loader receives a request for a resource named "org/jppf/node/event/NodeLifeCycleListener" or "org.jppf.node.event.NodeLifeCycleListener", so we can see which class loader is used and what it receives from the server.

Can you also check (again) that in your jar in the driver's lib folder, you have the META-INF/services/org.jppf.node.event.NodeLifeCycleListener and that it contains "epc_simulator.jppf.node.MyJobClassPathHandler" ?
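
One way to run this check without unpacking the jar by hand is sketched below with the JDK's java.util.jar API. ServiceFileCheck and declaredListeners() are hypothetical names. The entry name is the one quoted above, and the parsing assumes the usual service file conventions (one class name per line, '#' starts a comment).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of JPPF.
final class ServiceFileCheck {
  static final String ENTRY = "META-INF/services/org.jppf.node.event.NodeLifeCycleListener";

  // Returns the class names declared in the service file, or an empty list if the entry is absent.
  static List<String> declaredListeners(String jarPath) throws IOException {
    List<String> names = new ArrayList<>();
    try (JarFile jar = new JarFile(jarPath)) {
      JarEntry entry = jar.getJarEntry(ENTRY);
      if (entry == null) return names;
      try (BufferedReader in = new BufferedReader(
          new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
        String line;
        while ((line = in.readLine()) != null) {
          int hash = line.indexOf('#');          // strip trailing comments
          if (hash >= 0) line = line.substring(0, hash);
          line = line.trim();
          if (!line.isEmpty()) names.add(line);
        }
      }
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.out.println("usage: java ServiceFileCheck <path-to-jar>");
      return;
    }
    System.out.println(declaredListeners(args[0]));
  }
}
```

Run against the jar in the driver's lib folder, the printed list should contain epc_simulator.jppf.node.MyJobClassPathHandler; an empty list would mean the service file is missing or empty.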

Also, do you have, in the node's lib folder, any jppf-*.jar file other than jppf-common-node.jar, in particular jppf-common.jar or jppf-server.jar ?

Thanks,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #15 on: April 12, 2014, 11:57:34 AM »

Hi Laurent,

I have added the required line to log4j-node.properties. This file and the resulting jppf-node.log are attached. The log has no changes and there are no entries that include the strings "org/jppf/node/event/NodeLifeCycleListener" or "org.jppf.node.event.NodeLifeCycleListener".

The node lib folder has the entries: jmxremote_optional-1.0_01-ea.jar, jppf-common-node.jar, log4j-1.2.15.jar, slf4j-api-1.6.1.jar, slf4j-log4j12-1.6.1.jar

The content of my jar in the driver's lib is included in the src.zip of an earlier post of mine. The current jar is also attached.

I have no idea how the communication between server and client works or how I can check it. Is there a place where the missing jar from the driver's lib directory is stored after its transport to the node?

Would it make sense for me to send you my whole project (it is not big), so that you can try to run it in one of your distributed environments, with different computers for client, server and node? I am not sure that the UDP communication, which you need for the enabling mechanisms, is working.

Christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #16 on: April 12, 2014, 02:40:30 PM »

Hi Christian,

My mistake, I didn't give you the right package name. It should be:
Code: [Select]
log4j.logger.org.jppf.classloader=DEBUG

Can you try again with this? This is expected to add a lot of lines to the log.

If you want to send your project, I will give it a try, and hopefully reproduce the same issue.

Thanks,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #17 on: April 12, 2014, 05:27:47 PM »

Hi Laurent,

the new log file is in the attachment.
The interesting lines are:
2014-04-12 16:37:28,020 [DEBUG][org.jppf.classloader.AbstractJPPFClassLoader.findResources(276)]: JPPFClassLoader[id=1, type=server, uuidPath=[], offline=false, classpath=] resource [META-INF/services/org.jppf.node.event.NodeLifeCycleListener] not found locally, attempting remote lookup
2014-04-12 16:37:28,020 [DEBUG][org.jppf.classloader.ClassLoaderRequestHandler.run(157)]: sending batch of 1 class loading requests: CompositeResourceWrapper[resources=[JPPFResourceWrapper[dynamic=false, name=META-INF/services/org.jppf.node.event.NodeLifeCycleListener, state=NODE_REQUEST, uuidPath=TraversalList[position=-1, list=[]], callableID=-1]]]
2014-04-12 16:37:28,025 [DEBUG][org.jppf.classloader.ClassLoaderRequestHandler.run(163)]: got response CompositeResourceWrapper[resources=[JPPFResourceWrapper[dynamic=false, name=META-INF/services/org.jppf.node.event.NodeLifeCycleListener, state=NODE_RESPONSE, uuidPath=TraversalList[position=-1, list=[]], callableID=-1]]]
2014-04-12 16:37:28,026 [DEBUG][org.jppf.classloader.AbstractJPPFClassLoader.findRemoteResources(317)]: JPPFClassLoader[id=1, type=server, uuidPath=[], offline=false, classpath=]resource [META-INF/services/org.jppf.node.event.NodeLifeCycleListener] found remotely

This means it has found a NodeLifeCycleListener and it is loaded. On localhost it also found these 4 entries.

You can find my project at http://www.wi-bw.tfh-wildau.de/~drmue/JPPF/JPPF_2/

Thanks

Christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #18 on: April 14, 2014, 09:32:09 AM »

Hello Christian,

I've tried your project, and for me it worked in both cases, with a node on Windows (Win 8 pro) and with a node on Linux (Centos 6.2).
The node console output prints the following:
Code: [Select]
[lcohen@loloCentOS JPPF-4.0.2-node]$ ./startNode.sh
node process id: 3574
Attempting connection to the class server at 192.168.1.15:11111
RemoteClassLoaderConnection: Reconnected to the class server
JPPF Node management initialized
Attempting connection to the node server at 192.168.1.15:11111
Reconnected to the node server
MyJobClassPathHandler installed
Node successfully initialized
cmdGen: begin of Experiment
cmdGen: begin of Experiment
cmdGen: begin of Experiment
cmdGen: begin of Experiment
***** DESMO-J version 2.3.5 *****
Duration_Test starts at simulation time 07.01.2013 08:00:00:000 +0000 UTC.
 ...please wait...
***** DESMO-J version 2.3.5 *****
Duration_Test starts at simulation time 07.01.2013 08:00:00:000 +0000 UTC.
 ...please wait...
***** DESMO-J version 2.3.5 *****
Duration_Test starts at simulation time 07.01.2013 08:00:00:000 +0000 UTC.
 ...please wait...
***** DESMO-J version 2.3.5 *****
Duration_Test starts at simulation time 07.01.2013 08:00:00:000 +0000 UTC.
 ...please wait...
cmdGen: begin of Experiment
***** DESMO-J version 2.3.5 *****
Duration_Test starts at simulation time 07.01.2013 08:00:00:000 +0000 UTC.
 ...please wait...
Duration_Test stopped at simulation time 07.01.2013 17:00:00:000 +0000 UTC.
cmdGen: End of Experiment

The only change I made is in the client code to copy the jars into "JPPF-4.0.2-node/tmplib" instead of "/tmplib". Then the content of the tmplib folder is "Daten-2013-11.jar, Duration_Test.jar, desmoj-2.3.5-complete-bin.jar, EpcSimTools_20140115.jar" and those are the jars found in "_simulation.zip". I also created the "tmplib" folder under the JPPF node install root, since it is not created automatically.

So what I'm suspecting is that somehow "myClasspathNodeListener.jar" is not in your Linux driver's classpath, which prevents the Linux node from loading the life cycle listener. Could you post the content of the "lib" folder of both your Windows and Linux drivers, as a sanity check?

Another thing I observed, after adding "log4j.logger.org.jppf.utils.JPPFDefaultUncaughtExceptionHandler=DEBUG" to the node's log4j configuration, is that exceptions similar to this are thrown in some of the threads created by your tasks in the node:
Code: [Select]
2014-04-14 08:38:02,181 [DEBUG][org.jppf.utils.JPPFDefaultUncaughtExceptionHandler.uncaughtException(44)]: Uncaught exception in thread Thread[Kunde2#1#2,5,Duration_Test] : java.lang.NullPointerException
  at desmoj.core.simulator.TimeInstant.isBefore(Unknown Source)
  at desmoj.core.simulator.EventTreeList.insertAsFirst(Unknown Source)
  at desmoj.core.simulator.Scheduler.scheduleAfter(Unknown Source)
  at desmoj.core.simulator.SimProcess.activateAfter(Unknown Source)
  at desmoj.core.advancedModellingFeatures.CondQueue.activateAsNext(Unknown Source)
  at desmoj.core.advancedModellingFeatures.CondQueue.signal(Unknown Source)
  at model.Duration_Test.Logic_EPK_AND$Epc__id_95_Event_Nein.lifecycle(Unknown Source)
  at model.Duration_Test.Master_Kunde2.lifeCycle(Unknown Source)
  at desmoj.core.simulator.SimThread.run(Unknown Source)
This causes the tasks to never terminate. It's completely unrelated to the class loading problem and most likely unrelated to JPPF as well; I just wanted to make sure this was known.

I'm attaching the generated jppf-node.log, console output and content of the tmplib folder for your convenience.

Sincerely,
-Laurent
« Last Edit: April 14, 2014, 08:38:32 PM by lolo »
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #19 on: April 14, 2014, 11:31:49 PM »

Hi Laurent,

thanks a lot for your test. I see you are using the startNode.sh script to start the node.
In my earlier tests I was using the JPPFNode wrapper, so I tried the startNode.sh script.
With this start script the test ran. Please see the attachment.

One point that I do not understand is that the transferred jars are stored in the current node directory as D:\tmplib\Daten-2013-11.jar, ... Why are they stored there and not in /tmplib ?

When I use JPPFNode to start the node, it does not work. Is it possible that in this case the jars are stored at the wrong location?
Laurent, please try the test again using JPPFNode to start the node.

Thanks

Christian
Logged

lolo

  • Administrator
  • JPPF Council Member
  • *****
  • Posts: 2272
    • JPPF Web site
Re: portation of a win prototype to a linux cluster
« Reply #20 on: April 15, 2014, 12:56:45 AM »

Hi Christian,

I tried with the java service wrapper and "./JPPFNode start" and it worked for me.
On the client side, I changed the file locations as follows:
Code: [Select]
for(String archiv : this.archives){
  File jar           = new File(modelDir, archiv);
  Location<String> location   = new FileLocation(jar);
  Location<byte[]> memLocation = new MemoryLocation(location.toByteArray());
  Location<String> destination = new FileLocation("tmplib/" + archiv);
  classpath.add(archiv, memLocation, destination);
}
and I also created the "tmplib" folder under the node's root install folder.

Quote
One point that I do not understand is that the transferred jars are stored in the current node directory as D:\tmplib\Daten-2013-11.jar, ... Why are they stored there and not in /tmplib ?
I believe it is because in your code you use new FileLocation(new File("/tmplib/" + archiv)). When it is created with this constructor, the FileLocation will transform the path into a string using File.getCanonicalPath(), which returns an OS-dependent path. If instead you use the constructor FileLocation("/tmplib/" + archiv), this will not happen and the path will be valid on the Linux system.
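
The OS dependence of File.getCanonicalPath() can be seen with the JDK alone. In this sketch PathForms, viaFile() and viaString() are hypothetical names; it only contrasts a path that went through java.io.File with one kept as a plain string, and says nothing about FileLocation internals beyond what is described above.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical illustration, not part of JPPF.
final class PathForms {
  // Going through java.io.File resolves the path against the local file system,
  // so the result depends on the OS (and current drive) of the machine running this code.
  static String viaFile(String path) throws IOException {
    return new File(path).getCanonicalPath();
  }

  // A plain string is passed through untouched, whatever the local OS.
  static String viaString(String path) {
    return path;
  }

  public static void main(String[] args) throws IOException {
    String p = "/tmplib/Daten-2013-11.jar";
    System.out.println("via File:   " + viaFile(p));
    System.out.println("via String: " + viaString(p));
  }
}
```

Run on the Windows client, the first line is where a drive-prefixed path such as the D:\tmplib\... reported in this thread would show up, while the second stays usable on the Linux node.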

Sincerely,
-Laurent
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #21 on: April 15, 2014, 09:24:32 AM »

Hi Laurent,

thanks for your hints. I changed the file location in the client to
         Location<String> destination = new FileLocation("/tmplib/" + archiv);

Then it works with ./startNode.sh and ./JPPFNode start. But when the current directory is root's home and I start JPPFNode with its absolute path, the execution breaks because it can't find the jars.

I use this approach to start all nodes via ssh with a script running on the driver.
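
The cause is not established in this thread. One thing worth ruling out is that something in the node setup uses relative paths, since Java resolves those against the directory the JVM was started from (the "user.dir" system property), not against the install directory. A small sketch to print this from the node's environment follows; WorkingDirCheck and resolve() are hypothetical names, and "lib" and "tmplib" are used only as example relative paths.

```java
import java.io.File;

// Hypothetical diagnostic, not part of JPPF.
final class WorkingDirCheck {
  // Resolves a possibly relative path the way java.io.File does:
  // against the JVM's working directory (system property "user.dir").
  static String resolve(String path) {
    return new File(path).getAbsolutePath();
  }

  public static void main(String[] args) {
    System.out.println("user.dir = " + System.getProperty("user.dir"));
    System.out.println("lib      -> " + resolve("lib"));
    System.out.println("tmplib   -> " + resolve("tmplib"));
  }
}
```

If the printed directories differ between the two ways of starting the node, changing to the install directory in the ssh script before launching would be one thing to try.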

thanks
Christian
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #22 on: April 15, 2014, 05:39:04 PM »

Hi Laurent,

I have two questions:
1. When I start JPPFNode in its install directory with ./JPPFNode it works well, but when I start it from another directory with its absolute path, JPPFNode starts but my custom node listener does not (see my last reply). The documentation only discusses a call from the install directory with ./JPPFNode.

2. In my example the transfer of the jars in the job's classpath takes several minutes. Is it possible to check whether the jars have already been copied to the node (e.g. by name and checksum)? If I only copy the new jars from client to node, I can speed up the application dramatically.
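
The name-and-checksum comparison from question 2 could look roughly like this with the JDK's MessageDigest. JarChecksum, sha256() and needsTransfer() are hypothetical names. How the node's checksum would reach the client is left open here, and whether JPPF offers such a check itself is not answered in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of JPPF.
final class JarChecksum {
  // Computes the SHA-256 digest of a file as a lowercase hex string.
  static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) md.update(buf, 0, n);
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : md.digest()) hex.append(String.format("%02x", b));
    return hex.toString();
  }

  // A jar needs to be sent when the node reported no checksum for that name
  // (null here) or reported one that differs from the local file's.
  static boolean needsTransfer(Path localJar, String nodeChecksum)
      throws IOException, NoSuchAlgorithmException {
    return nodeChecksum == null || !nodeChecksum.equalsIgnoreCase(sha256(localJar));
  }

  public static void main(String[] args) throws Exception {
    Path jar = Files.createTempFile("model", ".jar"); // stand-in for a real jar
    Files.write(jar, "abc".getBytes(StandardCharsets.US_ASCII));
    String sum = sha256(jar);
    System.out.println(sum);
    System.out.println("resend if unchanged? " + needsTransfer(jar, sum));
  }
}
```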

Thanks
Christian
Logged

cm

  • JPPF Master
  • ***
  • Posts: 38
Re: portation of a win prototype to a linux cluster
« Reply #23 on: April 15, 2014, 07:47:22 PM »

Hi Laurent,

is it possible to measure the time required for data transfer between client and node, specifically for
- jars in the classpath
- and results

Thanks

Christian
Logged