Extended Class Loading sample
What does the sample do?
This sample uses the JPPF class loading extensions to automate the management of a repository of application libraries at runtime, for all the nodes in the grid.
Description of the problem to solve
Some applications require a large number of internal or external libraries to run.
When executed in a JPPF grid, they may incur a significant startup time, due to the loading of a very large number of classes across the network, which is the way JPPF works by default.
Furthermore, this startup overhead may occur every time a change occurs, not only in the application's code but also in any of the libraries it relies on.
A solution to the startup time issue is to deploy the libraries locally on each node.
However, this causes a management and deployment overhead, when one or more of the libraries is added, updated or removed.
When the number of nodes in the grid is large, the overhead of managing the libraries can be prohibitive.
Description of the solution
In this sample, we implement a mechanism that will allow each job to specify a set of libraries it needs (jar files) and each node to download these libraries and store them in a local repository,
then add them to the classpath of the JPPF tasks class loader. Each library will be stored along with a computed signature (e.g. MD5 or SHA-256), to help resolve the situations where multiple versions
of a library have the same name.
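The signature computation can be sketched as follows. This is a minimal illustration, not the sample's actual code: the class LibrarySignature and its methods are hypothetical names, and the stored name follows the 'file_name-file_signature.jar' format used by the node repository in this sample. Only the standard java.security.MessageDigest API is used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; it is not part of the sample's source code. */
final class LibrarySignature {
  /** Compute the hex-encoded digest of a jar's content, using e.g. "MD5" or "SHA-256". */
  static String hexDigest(byte[] content, String algorithm) {
    try {
      MessageDigest digest = MessageDigest.getInstance(algorithm);
      StringBuilder sb = new StringBuilder();
      // format each byte of the digest as two lowercase hex digits
      for (byte b : digest.digest(content)) sb.append(String.format("%02x", b & 0xff));
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("unsupported algorithm: " + algorithm, e);
    }
  }

  /** Build a name of the form 'file_name-file_signature.jar' from the original jar name. */
  static String repositoryName(String originalName, String signature) {
    String base = originalName.endsWith(".jar")
      ? originalName.substring(0, originalName.length() - 4)
      : originalName;
    return base + "-" + signature + ".jar";
  }

  public static void main(String[] args) {
    // stand-in for the bytes of a real jar file
    byte[] content = "demo content".getBytes(StandardCharsets.UTF_8);
    System.out.println(repositoryName("ClientLib1.jar", hexDigest(content, "MD5")));
  }
}
```

Two jar files with the same name but different content yield different signatures, and therefore different names in the repository, which is how same-named versions can coexist.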
To achieve this, our implementation will handle two main abstractions:
- A Repository interface, which will be in charge of maintaining the local node repository and downloading missing libraries
when requested by a job. As we can see in the code, this interface has only a few methods, as we tried to keep it simple. There is just enough to perform repository updates and maintenance
- A ClassPath interface, which represents a set of libraries needed by a job and which can be transported within the
Based on these abstractions, we define our mechanism in two distinct sections: node-side behavior and client-side behavior.
We define a repository class RepositoryImpl which encapsulates the following:
- The repository is implemented as a file folder directly under the node's installation root and named 'repository'. This folder has a flat structure and contains a set of jar files.
The jar file names have the format 'file_name-file_signature.jar', where file_name is the original file name without the '.jar' extension and file_signature is the computed MD5 signature for the file.
- The repository also maintains a file named 'toDelete.txt' which contains a list of jar files to delete upon loading of the repository. This is a workaround for the fact that on some OSes (e.g. Windows), the JVM keeps a lock
on the jar files it uses, which makes it impossible to delete them as long as the JVM is alive. Thus this file is used to keep track of the libraries that can't be deleted immediately
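The deferred deletion described above could look roughly like the sketch below. It is an assumption about how such a workaround can be written with java.nio.file, not the code of RepositoryImpl: the class DeferredDeletion and its methods are hypothetical, and the format of 'toDelete.txt' (one jar file name per line) is assumed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; assumes 'toDelete.txt' holds one jar file name per line. */
final class DeferredDeletion {
  static final String TO_DELETE = "toDelete.txt";

  /** Try to delete a jar now; if that fails, record it for later. Returns true if no deferral was needed. */
  static boolean deleteOrDefer(Path repository, String jarName) {
    try {
      Files.deleteIfExists(repository.resolve(jarName));
      return true;
    } catch (IOException e) {
      // typically a file locked by the JVM on Windows: remember it for the next cleanup
      try {
        Files.write(repository.resolve(TO_DELETE), List.of(jarName), StandardCharsets.UTF_8,
          StandardOpenOption.CREATE, StandardOpenOption.APPEND);
      } catch (IOException e2) {
        throw new UncheckedIOException(e2);
      }
      return false;
    }
  }

  /** Retry the pending deletions when the repository is loaded; returns the names that still could not be deleted. */
  static List<String> cleanup(Path repository) {
    Path list = repository.resolve(TO_DELETE);
    if (!Files.exists(list)) return List.of();
    List<String> remaining = new ArrayList<>();
    try {
      for (String name : Files.readAllLines(list, StandardCharsets.UTF_8)) {
        if (name.isBlank()) continue;
        try {
          Files.deleteIfExists(repository.resolve(name));
        } catch (IOException e) {
          remaining.add(name);
        }
      }
      // drop the list once everything is gone, otherwise keep only the failures
      if (remaining.isEmpty()) Files.delete(list);
      else Files.write(list, remaining, StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return remaining;
  }
}
```

In this sketch, cleanup() would be called when the repository is loaded, so that jars locked during the previous run are removed once the JVM that held them is gone.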
The node behavior is driven by an extended node life cycle listener
- On "node starting" events, the node creates a Repository instance and performs cleanup operations. The cleanup mainly consists of deleting the jar files that were marked as deleted but could not be physically
removed from the file system
- The same is performed upon "node ending" events
- On "job header loaded" events, the node reads the metadata associated with the jobs and performs repository updates accordingly:
- if a RepositoryFilter is provided, the repository's delete() method will remove the jar files that match the filter.
This is a repository maintenance operation, which we recommend performing via a broadcast job
- if a ClassPath is provided, then the node will download the specified libraries that are not in the repository, then add them to the
client class loader's classpath. If necessary, it will create a new class loader instance, without disconnecting from the driver, using the node's resetTaskClassLoader() method.
The NodeLifeCycleListenerEx.jobHeaderLoaded() method
is in fact the only place where this can be done safely
- On "job starting" and "job ending" events, there is no specific processing taking place.
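The decision taken on "job header loaded" events can be modeled with plain Java, without the JPPF API. The sketch below is a simplified, hypothetical model: the class JobClassPathPlan and its fields are illustrative names, and the rule "a new class loader is needed when a classpath is requested and the current class loader already holds jars" is an assumption derived from the description above. In the actual sample, the new class loader is obtained through the node's resetTaskClassLoader() method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of the node-side decision; it does not use the JPPF API. */
final class JobClassPathPlan {
  /** Requested jars that are not yet in the local repository. */
  final List<String> toDownload;
  /** Whether a fresh task class loader should be created before adding the jars. */
  final boolean needsNewClassLoader;

  private JobClassPathPlan(List<String> toDownload, boolean needsNewClassLoader) {
    this.toDownload = toDownload;
    this.needsNewClassLoader = needsNewClassLoader;
  }

  static JobClassPathPlan plan(Set<String> repositoryJars, Set<String> requestedJars, boolean classLoaderHasJars) {
    List<String> missing = new ArrayList<>();
    for (String jar : requestedJars) {
      if (!repositoryJars.contains(jar)) missing.add(jar);
    }
    // assumption: a reset is only useful when the job requests jars and the current loader is not empty
    boolean reset = !requestedJars.isEmpty() && classLoaderHasJars;
    return new JobClassPathPlan(Collections.unmodifiableList(missing), reset);
  }

  public static void main(String[] args) {
    Set<String> requested = new LinkedHashSet<>(List.of("ClientLib1-aaa.jar", "ClientLib2-bbb.jar"));
    // first job: empty repository, empty class loader
    JobClassPathPlan first = plan(Set.of(), requested, false);
    System.out.println("download " + first.toDownload + ", new class loader: " + first.needsNewClassLoader);
    // second job: jars already stored and already on the class loader's classpath
    JobClassPathPlan second = plan(requested, requested, true);
    System.out.println("download " + second.toDownload + ", new class loader: " + second.needsNewClassLoader);
  }
}
```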
On the client side, we have chosen the following implementation:
- The jar files that will be dynamically downloaded by the nodes are dropped into a file folder named "dynamicLibs". This folder is added to the JPPF client's classpath, but not the jar files it contains.
We need it to be at the client's classpath root, so that the distributed class loader will be able to find it and download its files
- Upon startup, the client application uses a command-line argument which specifies a file pattern, to determine which jar files will be part of the jobs' dynamic classpath.
From this pattern, it computes the MD5 signature for each matching file and builds a ClassPath instance based on the results.
Example: ./run.sh -c "*1.jar|*2.jar" will cause the application to use all jar files whose name ends with '1' or '2'
- Additionally, the client handles an optional command-line argument to specify another file pattern, to request that the nodes delete the matching files.
This argument is converted into a RepositoryFilter instance, which will be used by the nodes to delete the specified jar files.
For instance: -d "My*1.jar" will tell the nodes to remove all files whose name starts with "My" and ends with "1" from their repository
- The client application then creates 2 jobs, each with a task that uses one of the two dynamic libraries. It adds the ClassPath and optional RepositoryFilter instances to each job's metadata,
submits the jobs to the grid, then displays their execution results
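The file patterns given with -c and -d can be illustrated with the JDK's glob support. The sketch below is an assumption about the matching semantics, inferred from the examples above ('*' as a wildcard, '|' as a separator between alternatives). The class JarPatternFilter is a hypothetical name and is not the sample's RepositoryFilter.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter for illustration; assumes glob-like patterns separated by '|'. */
final class JarPatternFilter {
  private final List<PathMatcher> matchers = new ArrayList<>();

  JarPatternFilter(String patterns) {
    // each alternative becomes a separate glob matcher
    for (String pattern : patterns.split("\\|")) {
      matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + pattern.trim()));
    }
  }

  /** Returns true if the file name matches at least one of the patterns. */
  boolean accepts(String fileName) {
    Path name = Paths.get(fileName);
    for (PathMatcher matcher : matchers) {
      if (matcher.matches(name)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    JarPatternFilter filter = new JarPatternFilter("*1.jar|*2.jar");
    for (String name : List.of("ClientLib1.jar", "ClientLib2.jar", "Other3.jar")) {
      System.out.println(name + " selected: " + filter.accepts(name));
    }
  }
}
```

On the client, such a filter would select the jars in "dynamicLibs" whose signatures go into the ClassPath instance; on the node, the same kind of predicate would select the jars to delete.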
How to run the sample
Before running this sample, you need to install a JPPF server and at least one node.
For information on how to set up a node and server, please refer to the JPPF documentation
Once you have installed a server and node, perform the following steps:
- open a command prompt or shell console in JPPF-x.y-samples-pack/ExtendedClassLoading
- build the sample: type "ant jar" or simply "ant"; this will create 3 jar files:
- NodeListener.jar in this sample's root folder. This is our node life cycle listener implementation
- ClientLib1.jar and ClientLib2.jar in the "dynamicLibs" folder (this is the client's dynamic classpath).
These are here for demonstration purposes. Each of these libraries contains a single class used by a JPPF task in one of the submitted jobs.
When running this sample the first time, these classes will initially not be in the classpath of either the client or the node.
However, our repository management mechanism will automatically download these libraries to the node, so the task can be executed without error.
- copy "NodeListener.jar" into the "lib" folder of the JPPF driver installation, to add it to the driver's classpath. This will cause the nodes to download its classes from the server.
- start the server and the node
- Test scenario 1: from the command prompt previously opened, run the sample by typing:
run.bat -c "*.jar" on Windows
./run.sh -c "*.jar" on Linux/Unix
From now on we'll only specify the command as "<run>"; please substitute the appropriate syntax for your OS.
- in the client's console output, you will see a message indicating that a classpath with our 2 libraries was found, matching the file pattern specified on the command line.
Then, the client shows that the execution of each of our jobs was successful.
- On the node side, the console output shows the requested jars for each job. The jar files are downloaded for the first job, as they are not yet in the repository. Then the task class loader is updated with the downloaded files.
This is a little different with the second job: the jars are not downloaded this time, as they are now in the repository. Also, since the classpath of the class loader is not empty, we create a new one and add the jars to its classpath.
Finally, looking at the content of the node's "repository" folder, we can check that our 2 jar files are actually present in it.
- Test scenario 2: at the command prompt type: <run> -d "*.jar" to ask the node to delete all jar files in its repository.
- On the client's console output, we see that the jobs produced an error at execution:
This is the expected result, since we didn't specify any classpath for the jobs.
- On the node side, there are two different outcomes, depending on the OS you're on:
- on Linux/Unix systems, the 'repository' folder is now empty: the jar files were actually deleted
- on Windows systems, 'repository' still contains the 2 jar files, plus one additional file 'toDelete.txt'. The node could not delete the jars because the JVM is holding a lock on them.
Thus the deletion will only occur at the next node restart. If you look into 'toDelete.txt' with a text editor, you will see that it lists our 2 jar files.
- If you are on a Windows system, please kill the node and restart it: you will now see that the node's 'repository' folder is empty and has been properly cleaned up
- Test scenario 3: at the command prompt type: <run> -c "*2.jar"
- The node output shows that only 'MyLib2.jar' is downloaded from the client. Furthermore, we can see that this jar is now back in the 'repository' folder
- As expected, the client's output will show a NoClassDefFoundError for the 1st job and a successful execution for the 2nd
- Now, you can continue experimenting with various combinations of the -c and -d arguments, by adding different jars to the client's 'dynamicLibs' folder, etc.
Commented source files
What features of JPPF are demonstrated?
I have additional questions and comments, where can I go?
If you need more insight into the code of this demo, you can consult the Java source files located in the ExtendedClassLoading/src folder.
In addition, there are 2 privileged places you can go to: