Creating a custom load-balancer
From JPPF 5.2 Documentation
1 Overview of JPPF load-balancing
Load-balancing in JPPF relates to the way jobs are split into sub-jobs and how these sub-jobs are dispatched to the nodes for execution in parallel. Each sub-job contains a distinct subset of the tasks in the original job.
The distribution of the tasks to the nodes is performed by the JPPF driver, and it is the main factor in the observed performance of the framework. It essentially consists of determining how many tasks, out of a set of tasks sent by the client application, will go to each node for execution. Each set of tasks sent to a node is called a "bundle", and the role of the load-balancing (or task scheduling) algorithm is to optimize performance by adjusting the number of tasks sent to each node. In short, it is about computing the optimal bundle size for each node.
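To illustrate what a bundle size means in practice, here is a minimal, self-contained sketch. It does not use any JPPF class: `BundleCarver` and its `nextBundle()` method are hypothetical names, and the queue of strings merely stands in for the tasks of a job. The actual driver logic is more involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not a JPPF class: carves the next bundle off a queue of pending tasks.
class BundleCarver {
  // Removes up to bundleSize tasks from the head of the queue and returns them as one bundle.
  static <T> List<T> nextBundle(Deque<T> pending, int bundleSize) {
    List<T> bundle = new ArrayList<>();
    while (bundle.size() < bundleSize && !pending.isEmpty()) {
      bundle.add(pending.poll());
    }
    return bundle;
  }
}

class BundleCarverDemo {
  public static void main(String[] args) {
    Deque<String> pending = new ArrayDeque<>();
    for (int i = 1; i <= 10; i++) pending.add("task-" + i);
    // A fast node might be given 6 tasks at a time, a slow one only 2.
    System.out.println(BundleCarver.nextBundle(pending, 6).size() + " tasks to the fast node");
    System.out.println(BundleCarver.nextBundle(pending, 2).size() + " tasks to the slow node");
    System.out.println(pending.size() + " tasks still queued");
  }
}
```

The load-balancing algorithm only decides the `bundleSize` argument for each node; everything else in the sketch is plumbing.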
Each load-balancing algorithm is encapsulated within a class implementing the interface Bundler, defined as follows:
public interface Bundler {
  // Get the latest computed bundle size
  public int getBundleSize();
  // Feed the bundler with the latest execution result for the corresponding node
  public void feedback(int nbTasks, double totalTime);
  // Make a copy of this bundler
  public Bundler copy();
  // Get the timestamp at which this bundler was created
  public long getTimestamp();
  // Release the resources used by this bundler
  public void dispose();
  // Perform context-independent initializations
  public void setup();
  // Get the parameters profile used by this load-balancer
  public LoadBalancingProfile getProfile();
}
In practice, it will be more convenient to extend the abstract class AbstractBundler, which provides a default implementation for each method of the interface.
The load balancing in JPPF is feedback-driven. The server will create a Bundler instance for each node that is attached to it. When a set of tasks returns from a node after execution, the server will call the bundler's feedback() method so the bundler can recompute the bundle size with up-to-date data. Whether each bundler computes the bundle size independently of the other bundlers is entirely up to the implementor. Some of the JPPF built-in algorithms perform independent computations, others don't.
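To make the feedback loop concrete, here is a minimal, self-contained sketch. It does not use the JPPF classes: `SimpleBundler` is a hypothetical stand-in that mirrors only the getBundleSize() and feedback() methods of Bundler. The sizing rule (aim for a fixed target duration per bundle) and the time unit are illustrative assumptions, not one of the built-in algorithms.

```java
// Hypothetical stand-in mirroring two methods of the Bundler interface.
interface SimpleBundler {
  int getBundleSize();
  void feedback(int nbTasks, double totalTime);
}

// Illustrative rule: size bundles so that each one takes roughly targetTime to execute.
class TargetTimeBundler implements SimpleBundler {
  private final double targetTime; // same (assumed) unit as the totalTime passed to feedback()
  private long totalTasks;
  private double totalElapsed;
  private int bundleSize = 1; // start small until some feedback is available

  TargetTimeBundler(double targetTime) {
    this.targetTime = targetTime;
  }

  @Override
  public int getBundleSize() {
    return bundleSize;
  }

  @Override
  public void feedback(int nbTasks, double totalTime) {
    if (nbTasks <= 0 || totalTime <= 0d) return; // ignore meaningless samples
    totalTasks += nbTasks;
    totalElapsed += totalTime;
    double meanPerTask = totalElapsed / totalTasks;
    bundleSize = Math.max(1, (int) Math.round(targetTime / meanPerTask));
  }
}

class TargetTimeBundlerDemo {
  public static void main(String[] args) {
    TargetTimeBundler bundler = new TargetTimeBundler(1000d);
    System.out.println("initial size: " + bundler.getBundleSize());
    bundler.feedback(1, 50d); // one task took 50 time units
    System.out.println("after feedback: " + bundler.getBundleSize());
  }
}
```

This bundler computes its size independently of the bundlers assigned to other nodes; an algorithm that shares state between bundlers would need some form of synchronization.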
A bundler's life cycle is as follows:
- when the server starts up, it creates an initial bundler instance based on the load-balancing algorithm specified in the configuration file
- each time a node connects to the server, the server makes a copy of the initial bundler using the copy() method, calls the setup() method on the copy, and assigns this new bundler to the node
- when a node is disconnected, the server calls the dispose() method on the corresponding bundler, then discards it
- when the load balancing settings are changed using the management APIs or the administration console, the server creates a new initial Bundler instance, based on the new parameters.
Then, each time the server needs to provide feedback data from a node, it compares the creation timestamps of the initial bundler and of the node's bundler. If the node's bundler is older, the server replaces it with a copy of the initial bundler, obtained with the copy() method, and calls the setup() method on the new bundler.
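The life cycle described above can be simulated with a short, self-contained sketch. `LifecycleBundler` and `BundlerRegistry` are hypothetical names that only model the bookkeeping, not the actual server code. Two assumptions are made for the sake of the example: a counter replaces the wall-clock timestamp so that the behavior is deterministic, and a stale bundler is disposed of when it is replaced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a bundler; only the life cycle methods are modeled.
class LifecycleBundler {
  // A counter replaces the creation timestamp to keep the sketch deterministic.
  private static final AtomicLong CLOCK = new AtomicLong();
  private final long timestamp = CLOCK.incrementAndGet();
  final int size;
  boolean setupCalled;
  boolean disposed;

  LifecycleBundler(int size) { this.size = size; }
  LifecycleBundler copy() { return new LifecycleBundler(size); }
  long getTimestamp() { return timestamp; }
  void setup() { setupCalled = true; }
  void dispose() { disposed = true; }
}

// Hypothetical, much simplified model of the server-side bookkeeping.
class BundlerRegistry {
  private LifecycleBundler initial;
  private final Map<String, LifecycleBundler> byNode = new HashMap<>();

  // Server startup: create the initial bundler from the configured parameters.
  BundlerRegistry(int size) { initial = new LifecycleBundler(size); }

  void nodeConnected(String nodeId) { byNode.put(nodeId, freshCopy()); }

  void nodeDisconnected(String nodeId) {
    LifecycleBundler bundler = byNode.remove(nodeId);
    if (bundler != null) bundler.dispose();
  }

  // Settings changed: a new initial bundler replaces the old one.
  void settingsChanged(int newSize) { initial = new LifecycleBundler(newSize); }

  // Called before feedback is delivered: swap in a fresh copy if the node's bundler is stale.
  LifecycleBundler bundlerFor(String nodeId) {
    LifecycleBundler current = byNode.get(nodeId);
    if (current != null && current.getTimestamp() < initial.getTimestamp()) {
      current.dispose(); // assumption: the stale bundler is released
      current = freshCopy();
      byNode.put(nodeId, current);
    }
    return current;
  }

  private LifecycleBundler freshCopy() {
    LifecycleBundler copy = initial.copy();
    copy.setup();
    return copy;
  }
}

class BundlerRegistryDemo {
  public static void main(String[] args) {
    BundlerRegistry registry = new BundlerRegistry(5);
    registry.nodeConnected("node-1");
    System.out.println("size before change: " + registry.bundlerFor("node-1").size);
    registry.settingsChanged(10);
    System.out.println("size after change: " + registry.bundlerFor("node-1").size);
  }
}
```

Note that the replacement is lazy: a node keeps its old bundler until the next time feedback is due for it.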
Each bundler has an associated load balancing profile, which encapsulates the parameters of the algorithm. These parameters can be read from the JPPF configuration file, or from any other source. Using a profile is not mandatory; if you don't need one, you can simply have the getProfile() method return null.
In the following sections, we will see in detail how to implement a custom load-balancing algorithm, deploy it, and plug it into the JPPF server. We will do this by example, using the built-in “Fixed Size” algorithm, which is simple enough for our purpose.
Note 1: all JPPF built-in load balancing algorithms are implemented and plugged in as custom algorithms.
Note 2: for a fully detailed explanation of how load balancers work in JPPF, please read the Load Balancing section.
2 The algorithm and its associated profile
First, let's implement our parameters profile. To do so, we implement the interface LoadBalancingProfile:
public interface LoadBalancingProfile extends Serializable {
  // Make a copy of this profile
  public LoadBalancingProfile copy();
}
As we can see, this interface has a single method that creates a copy of a profile. Now let's see how it is implemented in the FixedSizeProfile class:
// Profile for the fixed bundle size load-balancing algorithm
public class FixedSizeProfile implements LoadBalancingProfile {
  // The bundle size
  private int size = 1;

  // Default constructor
  public FixedSizeProfile() {
  }

  // Initialize this profile with values read from the specified configuration
  public FixedSizeProfile(TypedProperties config) {
    size = config.getInt("size", 1);
  }

  // Make a copy of this profile
  public LoadBalancingProfile copy() {
    FixedSizeProfile other = new FixedSizeProfile();
    other.setSize(size);
    return other;
  }

  // Get the bundle size
  public int getSize() {
    return size;
  }

  // Set the bundle size
  public void setSize(int size) {
    this.size = size;
  }
}
This implementation is fairly trivial; the only notable element is the constructor taking a TypedProperties parameter, which allows us to read the size parameter from the JPPF configuration file.
Now let's take a look at the algorithm implementation itself:
public class FixedSizeBundler extends AbstractBundler {
  // Initialize this bundler
  public FixedSizeBundler(LoadBalancingProfile profile) {
    super(profile);
  }

  // This method always returns a statically assigned bundle size
  public int getBundleSize() {
    return ((FixedSizeProfile) profile).getSize();
  }

  // Make a copy of this bundler
  public Bundler copy() {
    return new FixedSizeBundler(profile.copy());
  }

  // Get the max bundle size that can be used for this bundler
  protected int maxSize() {
    return -1;
  }
}
The first thing we can notice is that the feedback() method is not even implemented! This is due to the fact that our algorithm is independent from the context and involves no computation. Thus, we use the default implementation in AbstractBundler, which does nothing. This is visible in the getBundleSize() method, where we simply return the value provided in the parameters profile.
We also notice a new method named maxSize(). It returns a value representing the maximum bundle size that a bundler can use at a given time. The goal is to prevent one node from receiving all or most of the tasks while the other nodes receive nothing and sit idle. This method is declared in the abstract class AbstractBundler and has no default implementation, to avoid any tight coupling between the bundler and the environment in which it runs. This allows the bundler to be used outside of the JPPF server, as is done for instance in the JPPF client when local execution mode is used along with remote execution.
In the context of the server, we have found that an efficient value for maxSize() can be computed from the current maximum number of tasks among all the jobs in the server queue. This value is accessible by calling the method JPPFQueue.getMaxBundleSize(). We could then rewrite our maxSize() method as follows:
protected int maxSize() {
  return JPPFDriver.getQueue().getMaxBundleSize() / 2;
}
The algorithm could then determine that a node should not receive more than half of that value (or 75% or any other function of it, whatever is deemed more efficient), so that other nodes will not be idle and the overall throughput will be optimized.
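As an illustration of how such a cap might be applied, here is a minimal, self-contained sketch. `CappedSizer` is a hypothetical class, and the `IntSupplier` stands in for the call to the server queue so that the example runs without JPPF; the fraction is a free parameter of the sketch.

```java
import java.util.function.IntSupplier;

// Hypothetical helper: limits a requested bundle size to a fraction of the largest queued job.
class CappedSizer {
  private final IntSupplier maxQueuedJobSize; // stand-in for the server queue lookup
  private final double fraction;              // e.g. 0.5 for "no more than half"

  CappedSizer(IntSupplier maxQueuedJobSize, double fraction) {
    this.maxQueuedJobSize = maxQueuedJobSize;
    this.fraction = fraction;
  }

  // The largest bundle a single node may receive right now, never less than 1.
  int maxSize() {
    return Math.max(1, (int) (maxQueuedJobSize.getAsInt() * fraction));
  }

  // Clamp the size the algorithm would like to use into the range [1, maxSize()].
  int clamp(int requested) {
    return Math.max(1, Math.min(requested, maxSize()));
  }
}

class CappedSizerDemo {
  public static void main(String[] args) {
    CappedSizer sizer = new CappedSizer(() -> 100, 0.5);
    System.out.println("requested 80, allowed " + sizer.clamp(80));
    System.out.println("requested 10, allowed " + sizer.clamp(10));
  }
}
```

Because the supplier is queried on every call, the cap follows the content of the queue as jobs arrive and complete.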
Tip: if your algorithm depends on the number of nodes, you can keep a count of bundler instances in a static variable of your implementation, and use the setup() and dispose() methods to increment and decrement the count as needed. For instance:
private static AtomicInteger instanceCount = new AtomicInteger(0);

public void setup() {
  instanceCount.incrementAndGet();
}

public void dispose() {
  instanceCount.decrementAndGet();
}
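One possible use of such a count is sketched below in a self-contained form. `EvenShareBundler` is a hypothetical class that does not extend the JPPF classes, and the rule it applies (an even share of a known number of pending tasks) is only an assumption chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical bundler: gives each connected node an even share of the pending tasks.
class EvenShareBundler {
  // One bundler exists per node, so this count approximates the number of nodes.
  private static final AtomicInteger INSTANCE_COUNT = new AtomicInteger(0);
  private final int pendingTasks;

  EvenShareBundler(int pendingTasks) {
    this.pendingTasks = pendingTasks;
  }

  void setup() {
    INSTANCE_COUNT.incrementAndGet();
  }

  void dispose() {
    INSTANCE_COUNT.decrementAndGet();
  }

  int getBundleSize() {
    int nodes = Math.max(1, INSTANCE_COUNT.get());
    // Ceiling division, so that no task is left over after one round.
    return Math.max(1, (pendingTasks + nodes - 1) / nodes);
  }
}

class EvenShareBundlerDemo {
  public static void main(String[] args) {
    EvenShareBundler first = new EvenShareBundler(10);
    first.setup();
    System.out.println("one node: " + first.getBundleSize());
    EvenShareBundler second = new EvenShareBundler(10);
    second.setup();
    System.out.println("two nodes: " + first.getBundleSize());
  }
}
```

The count is only accurate if every bundler that goes through setup() eventually goes through dispose().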
3 Implementing the bundler provider interface
Custom load-balancers are defined and deployed using the Service Provider Interface (SPI) mechanism. For a new load-balancer to be recognized by JPPF, it has to provide an implementation of the JPPFBundlerProvider interface, which is defined as:
public interface JPPFBundlerProvider {
  // Get the name of the algorithm defined by this provider
  // Each algorithm must have a name distinct from that of all other algorithms
  public String getAlgorithmName();
  // Create a bundler instance using the specified parameters profile
  public Bundler createBundler(LoadBalancingProfile profile);
  // Create a bundler profile containing the parameters of the algorithm
  public LoadBalancingProfile createProfile(TypedProperties configuration);
}
In the case of our fixed size algorithm, the FixedSizeBundlerProvider implementation is quite straightforward:
public class FixedSizeBundlerProvider implements JPPFBundlerProvider {
  // Get the name of the algorithm defined by this provider
  public String getAlgorithmName() {
    return "manual";
  }

  // Create a bundler instance using the specified parameters profile
  public Bundler createBundler(LoadBalancingProfile profile) {
    return new FixedSizeBundler(profile);
  }

  // Create a bundler profile containing the parameters of the algorithm
  public LoadBalancingProfile createProfile(TypedProperties configuration) {
    return new FixedSizeProfile(configuration);
  }
}
4 Deploying the custom load-balancer
For our custom load-balancer to be recognized and loaded, we need to create the corresponding service definition file. If it doesn't already exist, we create, in the source folder, a subfolder named META-INF/services. In this folder, we will create a file named org.jppf.load.balancer.spi.JPPFBundlerProvider, and open it in a text editor. In the editor, we add a single line containing the fully qualified name of our provider implementation:
org.jppf.load.balancer.spi.FixedSizeBundlerProvider
Now, to actually deploy our implementation, we will create a jar file that contains all the artifacts we have created: the Bundler, LoadBalancingProfile and JPPFBundlerProvider implementation classes, along with the META-INF/services folder, and add this jar to the class path of the server.
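For readers unfamiliar with the SPI mechanism, the following self-contained sketch shows how providers declared in META-INF/services can be discovered with the standard java.util.ServiceLoader class and indexed by algorithm name. It is not the JPPF lookup code: `AlgorithmProvider` is a hypothetical stand-in for JPPFBundlerProvider, `ProviderRegistry` is an invented name, and rejecting duplicate names is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical stand-in for JPPFBundlerProvider; only the algorithm name is modeled.
interface AlgorithmProvider {
  String getAlgorithmName();
}

// Hypothetical registry: indexes the discovered providers by algorithm name.
class ProviderRegistry {
  private final Map<String, AlgorithmProvider> byName = new LinkedHashMap<>();

  ProviderRegistry(Iterable<AlgorithmProvider> providers) {
    for (AlgorithmProvider provider : providers) {
      // Assumption: a duplicate name is treated as a configuration error.
      if (byName.putIfAbsent(provider.getAlgorithmName(), provider) != null) {
        throw new IllegalStateException("duplicate algorithm name: " + provider.getAlgorithmName());
      }
    }
  }

  // Returns the provider registered under the given name, or null if there is none.
  AlgorithmProvider find(String algorithmName) {
    return byName.get(algorithmName);
  }
}

class ProviderRegistryDemo {
  public static void main(String[] args) {
    // ServiceLoader instantiates the classes listed in the matching
    // META-INF/services file of every jar found on the class path.
    ProviderRegistry registry = new ProviderRegistry(ServiceLoader.load(AlgorithmProvider.class));
    System.out.println("manual registered: " + (registry.find("manual") != null));
  }
}
```

Run as is, the demo finds no provider since no service definition file exists for the stand-in interface; this is also what happens when the jar is missing from the class path or the file name does not match the interface name exactly.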
5 Node-aware load balancers
Load balancers can be made aware of a node's environment and configuration, and make dynamic decisions based on this information.
To this effect, the Bundler implementation will need to also implement the interface NodeAwareness, defined as follows:
// Bundler implementations should implement this interface
// if they wish to have access to a node's configuration
public interface NodeAwareness {
  // Get the corresponding node's system information
  JPPFSystemInformation getNodeConfiguration();
  // Set the corresponding node's system information
  void setNodeConfiguration(JPPFSystemInformation nodeConfiguration);
}
When implementing this interface, the environment and configuration of the node become accessible via an instance of JPPFSystemInformation.
JPPF guarantees that the node information will never be null once the node is connected to the server. You should not assume, however, that it is true when the Bundler is instantiated (for instance in the constructor).
The method setNodeConfiguration() can be called on two occasions:
- when the node connects to the server
- when the node's number of processing threads has been updated dynamically (through the admin console or management APIs)
A sample usage of NodeAwareness can be found in the CustomLoadBalancer sample, in the JPPF samples pack.
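Independently of that sample, the idea can be sketched in a self-contained way. The code below does not use the JPPF types: `NodeAwareStandIn` is a hypothetical interface that replaces JPPFSystemInformation with a plain map, and the property name used for the number of processing threads is an assumption. The sketch scales the bundle size with the thread count and tolerates a missing configuration, as is the case before the node is connected.

```java
import java.util.Map;

// Hypothetical stand-in for NodeAwareness, with a plain map instead of JPPFSystemInformation.
interface NodeAwareStandIn {
  Map<String, String> getNodeConfiguration();
  void setNodeConfiguration(Map<String, String> nodeConfiguration);
}

// Illustrative rule: send a fixed number of tasks per processing thread of the node.
class ThreadScaledBundler implements NodeAwareStandIn {
  // Assumed name of the property holding the node's number of processing threads.
  static final String THREADS_KEY = "jppf.processing.threads";
  private final int tasksPerThread;
  private volatile Map<String, String> nodeConfiguration; // null until the node connects
  private volatile int bundleSize;

  ThreadScaledBundler(int tasksPerThread) {
    this.tasksPerThread = tasksPerThread;
    this.bundleSize = tasksPerThread; // assume a single thread until told otherwise
  }

  @Override
  public Map<String, String> getNodeConfiguration() {
    return nodeConfiguration;
  }

  // Invoked on connection and again whenever the thread count changes.
  @Override
  public void setNodeConfiguration(Map<String, String> nodeConfiguration) {
    this.nodeConfiguration = nodeConfiguration;
    int threads = 1;
    if (nodeConfiguration != null) {
      try {
        threads = Math.max(1, Integer.parseInt(nodeConfiguration.getOrDefault(THREADS_KEY, "1").trim()));
      } catch (NumberFormatException e) {
        threads = 1; // fall back to a single thread on a malformed value
      }
    }
    bundleSize = threads * tasksPerThread;
  }

  int getBundleSize() {
    return bundleSize;
  }
}

class ThreadScaledBundlerDemo {
  public static void main(String[] args) {
    ThreadScaledBundler bundler = new ThreadScaledBundler(3);
    System.out.println("before connection: " + bundler.getBundleSize());
    bundler.setNodeConfiguration(java.util.Collections.singletonMap(ThreadScaledBundler.THREADS_KEY, "4"));
    System.out.println("with 4 threads: " + bundler.getBundleSize());
  }
}
```

Recomputing the size inside the setter, rather than in the constructor, is what lets the bundler react to a dynamic change in the number of threads.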
6 Job-aware load balancers
Load-balancers can gain access to information on a job via the JPPFDistributedJob interface. This is done by having the Bundler implement the JobAwarenessEx interface, defined as follows:
// Bundler implementations should implement this interface
// if they wish to have access to a job's metadata
public interface JobAwarenessEx {
  // Get the current job information
  JPPFDistributedJob getJob();
  // Set the current job
  void setJob(JPPFDistributedJob job);
}
The method setJob() is always called after the execution policy (if any) has been applied to the node, and before the job is dispatched to the node for execution. This allows the load-balancer to use information about the job when computing the number of tasks to send to the node.
An example usage of JobAwarenessEx can be found in the CustomLoadBalancer sample, in the JPPF samples pack.
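As a complement, here is a minimal, self-contained sketch of a job-aware sizing rule. It does not use the JPPF types: `JobInfoStandIn` is a hypothetical interface that exposes only a task count in place of JPPFDistributedJob, and the rule (never send more than a given share of the job to one node) is an assumption chosen for illustration.

```java
// Hypothetical stand-in for the job information; only the task count is modeled.
interface JobInfoStandIn {
  int getTaskCount();
}

// Illustrative rule: use a default size, but split every job into at least minShares bundles.
class JobShareBundler {
  private final int defaultSize;
  private final int minShares;
  private JobInfoStandIn job; // set before each dispatch, null until then

  JobShareBundler(int defaultSize, int minShares) {
    this.defaultSize = defaultSize;
    this.minShares = Math.max(1, minShares);
  }

  JobInfoStandIn getJob() {
    return job;
  }

  void setJob(JobInfoStandIn job) {
    this.job = job;
  }

  int getBundleSize() {
    if (job == null) return defaultSize;
    // Ceiling division: the largest share that still yields at least minShares bundles.
    int share = (job.getTaskCount() + minShares - 1) / minShares;
    return Math.max(1, Math.min(defaultSize, share));
  }
}

class JobShareBundlerDemo {
  public static void main(String[] args) {
    JobShareBundler bundler = new JobShareBundler(50, 4);
    bundler.setJob(() -> 100); // a job with 100 tasks
    System.out.println("bundle size for a 100-task job: " + bundler.getBundleSize());
    bundler.setJob(() -> 1000); // a job with 1000 tasks
    System.out.println("bundle size for a 1000-task job: " + bundler.getBundleSize());
  }
}
```

Since the job is set before each dispatch, the size is computed lazily in getBundleSize() rather than cached.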
7 Job-aware load balancers: the deprecated JobAwareness interface
Deprecation notice: this API is deprecated as of JPPF 5.1 and will be removed in a future version. You should use the JobAwarenessEx interface instead, as described in the section above.
Load-balancers can gain access to a job's metadata (see the “Job Metadata” section of the Development Guide).
This is done by having the Bundler implement the interface JobAwareness, defined as follows:
// Bundler implementations should implement this interface
// if they wish to have access to a job's metadata
public interface JobAwareness {
  // Get the current job's metadata
  JobMetadata getJobMetadata();
  // Set the current job's metadata
  void setJobMetadata(JobMetadata metadata);
}
When implementing this interface, the job metadata becomes accessible via an instance of JobMetadata.
The method setJobMetadata() is always called after the execution policy (if any) has been applied to the node, and before the job is dispatched to the node for execution. This allows the load-balancer to use information about the job when computing the number of tasks to send to the node.