Server management

From JPPF 2.5 Documentation

Out of the box in JPPF 2.0, each server provides two MBeans that can be accessed remotely through an RMI remote connector with the JMX URL “service:jmx:rmi:///jndi/rmi://host:port/jppf/driver”. Here, host is the host name or IP address of the machine on which the server is running (the value of “jppf.management.host” in the server's configuration file), and port is the value of the property “jppf.management.port” in the same file.
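
As a small illustration of how this URL is assembled, the sketch below builds it from a host and a port, then parses it with the standard javax.management.remote.JMXServiceURL class. The class DriverJmxUrl, its methods and the port value 11198 are assumptions made for this example; they are not part of the JPPF API, and no connection is attempted.

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of JPPF: assembles and parses the driver's JMX URL.
final class DriverJmxUrl {
  // host and port stand for "jppf.management.host" and "jppf.management.port"
  static String driverUrl(String host, int port) {
    return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jppf/driver";
  }

  // Parsing only validates the syntax of the URL; it does not open a connection.
  static JMXServiceURL parse(String url) {
    try {
      return new JMXServiceURL(url);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("invalid JMX URL: " + url, e);
    }
  }

  public static void main(String[] args) {
    // 11198 is only an example value for the management port
    JMXServiceURL serviceUrl = parse(driverUrl("localhost", 11198));
    System.out.println(serviceUrl.getProtocol() + " " + serviceUrl.getURLPath());
    // a real client would now pass serviceUrl to JMXConnectorFactory.connect()
  }
}
```

In practice the JMX wrapper classes described later on this page hide this step, so the URL only needs to be built by hand when using the plain JMX API.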

1 Server-level management and monitoring

MBean name: “org.jppf:name=admin,type=driver”

This is also the value of the constant JPPFAdminMBean.DRIVER_MBEAN_NAME.

This MBean's role is to perform management and monitoring at the server level. It exposes the JPPFAdminMBean interface, which provides the functionalities described hereafter.

1.1 Server statistics

You can get a snapshot of the server's state by invoking the following method, which provides statistics on execution performance, network overhead, server queue behavior, and the number of connected nodes and clients:

 /**
  * Get the latest statistics snapshot from the JPPF driver.
  * @return a <code>JPPFStats</code> instance.
  * @throws Exception if any error occurs.
  */
 public JPPFStats statistics() throws Exception;

This method returns an object of type JPPFStats, which exposes the following accessors:

 // total number of tasks executed
 public int getTotalTasksExecuted()
 // time statistics for the tasks execution,
 // includes network transport and node execution time
 public TimeSnapshot getExecution()
 // time statistics for execution within the nodes
 public TimeSnapshot getNodeExecution()
 // time statistics for the network transport between nodes and server
 public TimeSnapshot getTransport()
 // time statistics for the server overhead
 public TimeSnapshot getServer()
 // time statistics for the queued tasks
 public TimeSnapshot getQueue()
 // total number of tasks that have been queued
 public int getTotalQueued()
 // number of tasks in the queue
 public int getQueueSize()
 // peak queue size
 public int getMaxQueueSize()
 // current number of nodes connected to the server
 public int getNbNodes()
 // peak number of nodes connected to the server
 public int getMaxNodes()
 // the current number of clients connected to the server
 public int getNbClients()
 // peak number of clients connected to the server
 public int getMaxClients()

Some of these methods return an instance of the class TimeSnapshot, which encapsulates several time-related statistics. It exposes the following methods:

 // total cumulated time
 public long getTotalTime()
 // latest observed time
 public long getLatestTime()
 // smallest observed time
 public long getMinTime()
 // peak time
 public long getMaxTime()
 // average time
 public double getAvgTime()
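
To make the meaning of these five values concrete, here is a minimal sketch of how such a snapshot could be maintained as new timings are observed. TimeSnapshotSketch and its record() method are hypothetical names used for illustration only; this is not the JPPF implementation, and the millisecond unit is an assumption.

```java
// Hypothetical stand-in for illustration; not the JPPF TimeSnapshot class.
final class TimeSnapshotSketch {
  private long totalTime = 0L;
  private long latestTime = 0L;
  private long minTime = Long.MAX_VALUE;
  private long maxTime = 0L;
  private long count = 0L;

  // Record one newly observed time (assumed to be in milliseconds).
  void record(long time) {
    count++;
    totalTime += time;
    latestTime = time;
    minTime = Math.min(minTime, time);
    maxTime = Math.max(maxTime, time);
  }

  long getTotalTime() { return totalTime; }
  long getLatestTime() { return latestTime; }
  // reports 0 until a first value has been recorded
  long getMinTime() { return count == 0 ? 0L : minTime; }
  long getMaxTime() { return maxTime; }
  double getAvgTime() { return count == 0 ? 0d : (double) totalTime / count; }

  public static void main(String[] args) {
    TimeSnapshotSketch snapshot = new TimeSnapshotSketch();
    for (long time : new long[] {10L, 30L, 20L}) snapshot.record(time);
    // prints total=60 latest=20 min=10 max=30 avg=20.0
    System.out.println("total=" + snapshot.getTotalTime() + " latest=" + snapshot.getLatestTime()
      + " min=" + snapshot.getMinTime() + " max=" + snapshot.getMaxTime()
      + " avg=" + snapshot.getAvgTime());
  }
}
```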

1.2 Stopping and restarting the server

 /**
  * Perform a shutdown or restart of the server.
  * @param shutdownDelay - the delay before shutting down the server,
  * once the command is received. 
  * @param restartDelay - the delay before restarting, once the server is shutdown.
  * If the value is negative, no restart occurs, the server simply shuts down.
  * @return an acknowledgement message.
  * @throws Exception if any error occurs.
  */
 public String restartShutdown(Long shutdownDelay, Long restartDelay) throws Exception;

This method allows you to remotely shut down the server, and optionally to restart it after a specified delay. This can be useful when an upgrade or maintenance of the server must take place within a limited time window.

1.3 Managing the nodes attached to the server

 /**
  * Request the JMX connection information for all the nodes attached to the server.
  * @return a collection of <code>JPPFManagementInfo</code> instances.
  * @throws Exception if any error occurs.
  */
 public Collection<JPPFManagementInfo> nodesInformation() throws Exception;

The JPPFManagementInfo objects returned in the resulting collection encapsulate enough information to connect to the corresponding node's MBean server:

 public class JPPFManagementInfo
   implements Serializable, Comparable<JPPFManagementInfo>
 {
   // the host on which the node is running
   public String getHost()
 
   // the port on which the node's JMX server is listening
   public int getPort()
 }

For example, based on what we saw in the section about nodes management, we could write code that gathers connection information for each node attached to a server, and then performs some management request on them:

 // Obtain connection information for all attached nodes
 Collection<JPPFManagementInfo> nodesInfo = myDriverMBeanProxy.nodesInformation();
 // for each node
 for (JPPFManagementInfo info: nodesInfo)
 {
   // create a JMX connection wrapper based on the node information
   JMXNodeConnectionWrapper wrapper =
     new JMXNodeConnectionWrapper(info.getHost(), info.getPort());
   // connect to the node's MBean server
   wrapper.connectAndWait(5000);
   // restart the node
   wrapper.restart();
 }

1.4 Load-balancing settings

The driver management MBean provides two methods to dynamically obtain and change the server's load balancing settings:

 /**
  * Obtain the current load-balancing settings.
  * @return an instance of <code>LoadBalancingInformation</code>.
  * @throws Exception if an error occurred while fetching the settings.
  */
 public LoadBalancingInformation loadBalancerInformation() throws Exception;

This method returns an object of type LoadBalancingInformation, defined as follows:

 public class LoadBalancingInformation implements Serializable
 {
   // the name of the algorithm currently used by the server
   public String algorithm = null;
   // the algorithm's parameters
   public TypedProperties parameters = null;
   // the names of all algorithms available to the server
   public List<String> algorithmNames = null;
 }

Notes:

  • the value of algorithm is included in the list of algorithm names
  • parameters contains a mapping of the algorithm's parameter names to their current values. Unlike what we have seen in the configuration guide chapter, the parameter names are expressed without a prefix: instead of strategy.<profile_name>.<parameter_name>, they are simply named <parameter_name>.

It is also possible to dynamically change the load-balancing algorithm used by the server, and/or its parameters:

 /**
  * Change the load-balancing settings.
  * @param algorithm - the name of the load-balancing algorithm to set.
  * @param parameters - the algorithm's parameters.
  * @return an acknowledgement or error message.
  * @throws Exception if an error occurred while updating the settings.
  */
 public String changeLoadBalancerSettings(String algorithm, Map parameters)
   throws Exception;

Where:

  • algorithm is the name of the algorithm to use. If it is not known to the server, no change occurs.
  • parameters is a map of algorithm parameter names to their values. As above, the parameter names must be expressed without a prefix. Internally, the JPPF server will use the profile name “jppf”.
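
As an illustration of this naming rule, the following sketch converts configuration-style properties into a map whose keys no longer carry the strategy.<profile_name>. part. The helper class LoadBalancingParams, the profile name “myProfile” and the parameter names are assumptions made for the example, not part of the JPPF API.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of JPPF.
final class LoadBalancingParams {
  // Keep only the "strategy.<profile>.<name>" entries and strip that leading part.
  static Map<String, String> stripProfile(Properties config, String profile) {
    String leading = "strategy." + profile + ".";
    Map<String, String> result = new TreeMap<>();
    for (String key : config.stringPropertyNames()) {
      if (key.startsWith(leading)) {
        result.put(key.substring(leading.length()), config.getProperty(key));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Properties config = new Properties();
    // profile and parameter names below are made up for the example
    config.setProperty("strategy.myProfile.size", "5");
    config.setProperty("strategy.myProfile.minSamples", "100");
    config.setProperty("jppf.management.port", "11198");
    // prints {minSamples=100, size=5}
    System.out.println(stripProfile(config, "myProfile"));
  }
}
```

A map built this way has the shape expected for the parameters argument of changeLoadBalancerSettings().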

2 Job-level management and monitoring

MBean name: “org.jppf:name=jobManagement,type=driver”

This is also the value of the constant JPPFAdminMBean.DRIVER_JOB_MANAGEMENT_MBEAN_NAME.

The role of this MBean is to control and monitor the life cycle of all jobs submitted to the server. It exposes the DriverJobManagementMBean interface, defined as follows:

 public interface DriverJobManagementMBean extends NotificationEmitter
 {
   // Cancel the job with the specified id
   public void cancelJob(String jobId) throws Exception;
   // Suspend the job with the specified id
   public void suspendJob(String jobId, Boolean requeue) throws Exception;
   // Resume the job with the specified id
   public void resumeJob(String jobId) throws Exception;
   // Update the maximum number of nodes a job can run on
   public void updateMaxNodes(String jobId, Integer maxNodes) throws Exception;
   // Get the set of ids for all the jobs currently queued or executing
   public String[] getAllJobIds() throws Exception;
   // Get an object describing the job with the specified id
   public JobInformation getJobInformation(String jobId) throws Exception;
   // Get a list of objects describing the nodes to which the whole
   // or part of a job was dispatched
   public NodeJobInformation[] getNodeInformation(String jobId) throws Exception;
 }

Reminder:

A job can be made of multiple tasks. These tasks may not all be executed on the same node. Instead, the set of tasks may be split into several subsets, and these subsets can in turn be dispatched to different nodes to allow their execution in parallel. In the remainder of this section we will call each subset a “sub-job”, to distinguish them from actual jobs at the server level. Thus a job is associated with a server, whereas a sub-job is associated with a node.

2.1 Controlling a job's life cycle

It is possible to terminate, suspend and resume a job using the following methods:

 /**
  * Cancel the job with the specified id.
  * @param jobId the id of the job to cancel.
  * @throws Exception if any error occurs.
  */
 public void cancelJob(String jobId) throws Exception;

This will terminate the job with the specified jobId. Any sub-job running in a node will be terminated as well. If a sub-job was partially executed (i.e. at least one task execution was completed), the results are discarded. If the job was still waiting in the server queue, it is simply removed from the queue, and the enclosed tasks are returned in their original state to the client.

 /**
  * Suspend the job with the specified id.
  * @param jobId the id of the job to suspend.
  * @param requeue true if the sub-jobs running on each node should be canceled
  * and requeued, false if they should be left to execute until completion.
  * @throws Exception if any error occurs.
  */
 public void suspendJob(String jobId, Boolean requeue) throws Exception;

This method will suspend the job with the specified jobId. The requeue parameter specifies how the currently running sub-jobs will be processed:

  • if true, then the sub-job is canceled and inserted back into the server queue, for execution at a later time
  • if false, JPPF will let the sub-job finish executing in the node, then suspend the rest of the job still in the server queue

If the job is already suspended, then calling this method has no effect.

 /**
  * Resume the job with the specified id.
  * @param jobId the id of the job to resume.
  * @throws Exception if any error occurs.
  */
 public void resumeJob(String jobId) throws Exception;

This method resumes the execution of a suspended job with the specified jobId. If the job was not suspended, this method has no effect.

2.2 Number of nodes assigned to a job

 /**
  * Update the maximum number of nodes a job can run on.
  * @param jobId the id of the job to update.
  * @param maxNodes the new maximum number of nodes for the job.
  * @throws Exception if any error occurs.
  */
 public void updateMaxNodes(String jobId, Integer maxNodes) throws Exception;

This method specifies the maximum number of nodes on which the job with the specified jobId can run in parallel. It does not guarantee that this number of nodes will be used: the nodes may already be assigned to other jobs, or the job may not be split into that many sub-jobs (depending on the load-balancing algorithm). However, it does guarantee that no more than maxNodes nodes will be used to execute the job.

2.3 Job introspection

 /**
  * Get the set of ids for all the jobs currently queued or executing.
  * @return an array of ids as strings.
  * @throws Exception if any error occurs.
  */
 public String[] getAllJobIds() throws Exception;

This method returns the IDs of all the jobs currently handled by the server. These IDs can be used directly with the other methods of the job management MBean.

 /**
  * Get an object describing the job with the specified id. 
  * @param jobId the id of the job to get information about.
  * @return an instance of <code>JobInformation</code>.
  * @throws Exception if any error occurs.
  */
 public JobInformation getJobInformation(String jobId) throws Exception;

This method retrieves information about the state of a job in the server. It returns an object of type JobInformation, defined as follows:

 public class JobInformation implements Serializable
 {
   // the unique identifier for the job
   public String getJobId()
   // the current number of tasks in the job or sub-job
   public int getTaskCount()
   // the priority of this task bundle
   public int getPriority()
   // the initial task count of the job (at submission time)
   public int getInitialTaskCount()
   // determine whether the job is in suspended state
   public boolean isSuspended()
   // get the maximum number of nodes this job can run on
   public int getMaxNodes()
   // the pending state of the job
   // a job is pending if its scheduled execution date/time has not yet been reached
   public boolean isPending()
 }

It is also possible to obtain information about all the sub-jobs of a job that are dispatched to remote nodes:

 /**
  * Get a list of objects describing the sub-jobs of a job, and the nodes to which
  * they were dispatched.
  * @param jobId the id of the job for which to find the information.
  * @return an array of <code>NodeJobInformation</code> instances.
  * @throws Exception if any error occurs.
  */
 public NodeJobInformation[] getNodeInformation(String jobId) throws Exception;

The return value is an array of objects of type NodeJobInformation, defined as follows:

 public class NodeJobInformation implements Serializable
 {
   // The JMX connection information for the node
   public final JPPFManagementInfo nodeInfo;
 
   // The information about the sub-job
   public final JobInformation jobInfo;
 }

This class is simply a grouping of two objects of type JobInformation and JPPFManagementInfo, which we have already seen previously. The nodeInfo attribute will allow us to connect to the corresponding node's MBean server and obtain additional job monitoring data.

2.4 Job notifications

Whenever a job-related event occurs, the job management MBean will emit a notification of type JobNotification, defined as follows:

 public class JobNotification extends Notification
 {
   // the information about the job or sub-job
   public JobInformation getJobInformation()
 
   // the information about the node (for sub-jobs only)
   // null for a job on the server side
   public JPPFManagementInfo getNodeInfo()
 
   // the creation timestamp for this event
   public long getTimestamp()
 
   // the type of this job event
   public JobEventType getEventType()
 }

The value of the job event type (see the JobEventType type-safe enumeration) is one of the following:

  • JOB_QUEUED: a new job was submitted to the JPPF driver queue
  • JOB_ENDED: a job was completed and sent back to the client
  • JOB_DISPATCHED: a sub-job was dispatched to a node
  • JOB_RETURNED: a sub-job returned from a node
  • JOB_UPDATED: one of the job attributes has changed
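
To illustrate how these event types relate to one another, the sketch below keeps a running count of the sub-jobs currently out on nodes for each job. It uses a local enum that mirrors the names listed above and a hypothetical JobEventTracker class, so that it can run without JPPF on the classpath; how the events reach onEvent() is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker for illustration; not part of JPPF.
final class JobEventTracker {
  // local mirror of the event types listed above
  enum EventType { JOB_QUEUED, JOB_ENDED, JOB_DISPATCHED, JOB_RETURNED, JOB_UPDATED }

  // number of sub-jobs currently dispatched to nodes, per job id
  private final Map<String, Integer> dispatched = new HashMap<>();

  void onEvent(String jobId, EventType type) {
    switch (type) {
      case JOB_QUEUED:
        dispatched.put(jobId, 0);
        break;
      case JOB_DISPATCHED:
        dispatched.merge(jobId, 1, Integer::sum);
        break;
      case JOB_RETURNED:
        dispatched.merge(jobId, -1, Integer::sum);
        break;
      case JOB_ENDED:
        dispatched.remove(jobId);
        break;
      default:
        break; // JOB_UPDATED does not change the count
    }
  }

  // -1 means the job is unknown to the tracker or has already ended
  int subJobsInFlight(String jobId) {
    return dispatched.getOrDefault(jobId, -1);
  }

  public static void main(String[] args) {
    JobEventTracker tracker = new JobEventTracker();
    tracker.onEvent("job-1", EventType.JOB_QUEUED);
    tracker.onEvent("job-1", EventType.JOB_DISPATCHED);
    tracker.onEvent("job-1", EventType.JOB_DISPATCHED);
    tracker.onEvent("job-1", EventType.JOB_RETURNED);
    // prints 1
    System.out.println(tracker.subJobsInFlight("job-1"));
  }
}
```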

3 Accessing and using the server MBeans

As for the nodes, JPPF provides an API that simplifies access to the JMX-based management features of a server, by abstracting most of the complexities of JMX programming. This API is represented by the class JMXDriverConnectionWrapper that provides a simplified way of connecting to the server's MBean server, along with a set of convenience methods to easily access the MBeans' exposed methods and attributes.

Please note that this class implements the JPPFDriverAdminMBean interface, as well as all the methods in the DriverJobManagementMBean interface (but without implementing the interface itself).

3.1 Connecting to an MBean server

Connecting to a server's MBean server is done in two steps:

a. Create an instance of JMXDriverConnectionWrapper

To connect to a local (same JVM) MBean server, use the no-arg constructor:

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper();

To connect to a remote MBean server, use the constructor specifying the management host and port:

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper(host, port);

Here, host and port represent the server's configuration properties “jppf.management.host” and “jppf.management.port”.

b. Initiate the connection to the MBean server and wait until it is established

There are two ways to do this:

Synchronously:

 // connect and wait for the connection to be established
 // choose a reasonable value for the timeout, or 0 for no timeout
 wrapper.connectAndWait(timeout);

Asynchronously:

 // initiate the connection; this method returns immediately
 wrapper.connect();
 
 // ... do something else ...
 
 // check if we are connected
 if (wrapper.isConnected()) ...;
 else ...;
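
When the connection is initiated asynchronously, the connected state typically has to be checked more than once. One possible approach, sketched below, is to poll the state with an upper bound on the waiting time. The ConnectionPolling class and its waitUntil() method are hypothetical; the connection state is abstracted as a BooleanSupplier so that the example runs without JPPF.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of JPPF.
final class ConnectionPolling {
  // Poll the given state until it is true or the timeout expires; return the final state.
  static boolean waitUntil(BooleanSupplier connected, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!connected.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0L) return false;
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // preserve the interrupt status and stop waiting
        Thread.currentThread().interrupt();
        return connected.getAsBoolean();
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // simulates a connection that gets established after about 100 ms
    final AtomicBoolean state = new AtomicBoolean(false);
    new Thread(() -> {
      try { Thread.sleep(100L); } catch (InterruptedException ignored) { }
      state.set(true);
    }).start();
    // with a real wrapper this could be: waitUntil(wrapper::isConnected, 5000L, 50L)
    System.out.println("connected: " + waitUntil(state::get, 2000L, 20L));
  }
}
```

If nothing useful can be done while waiting, the synchronous connectAndWait(timeout) shown earlier is the simpler choice.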

3.2 Direct use of the JMX wrapper

JMXDriverConnectionWrapper directly implements the JPPFDriverAdminMBean interface, as well as all the methods in the DriverJobManagementMBean interface (but without implementing the interface itself). This means that all the JPPF server's management and monitoring methods can be called directly on the JMX wrapper. For example:

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper(host, port);
 wrapper.connectAndWait(timeout);
 
 // get the ids of all jobs in the server queue
 String[] jobIds = wrapper.getAllJobIds();
 // stop the server in 2 seconds (no restart)
 wrapper.restartShutdown(2000L, -1L);

3.3 Use of the JMX wrapper's invoke() method

JMXConnectionWrapper.invoke() is a generic method that allows invoking any exposed method of an MBean.
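
At the level of the standard JMX API, a generic invocation of this kind takes an MBean name, an operation name, an array of arguments and an array of parameter type names. The stand-alone sketch below shows this with a made-up MBean registered on the local platform MBean server; DemoAdminMBean, DemoAdmin and the object name “demo:name=admin,type=driver” are assumptions for the example and do not exist in JPPF.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Stand-alone sketch: a made-up MBean registered locally and invoked generically.
final class GenericInvokeSketch {
  // hypothetical management interface, loosely modeled on the methods described above
  public interface DemoAdminMBean {
    int nodeCount();
    String restartShutdown(Long shutdownDelay, Long restartDelay);
  }

  public static class DemoAdmin implements DemoAdminMBean {
    @Override public int nodeCount() { return 3; }
    @Override public String restartShutdown(Long shutdownDelay, Long restartDelay) {
      return "shutdown in " + shutdownDelay + " ms, "
        + (restartDelay < 0 ? "no restart" : "restart after " + restartDelay + " ms");
    }
  }

  // Returns the results of the two generic invocations.
  static Object[] demo() {
    try {
      MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
      ObjectName name = new ObjectName("demo:name=admin,type=driver");
      if (!mbs.isRegistered(name)) mbs.registerMBean(new DemoAdmin(), name);
      // operation without parameters: empty argument and signature arrays
      Object count = mbs.invoke(name, "nodeCount", new Object[0], new String[0]);
      // the signature holds the fully qualified class name of each parameter
      Object ack = mbs.invoke(name, "restartShutdown",
        new Object[] {2000L, -1L}, new String[] {"java.lang.Long", "java.lang.Long"});
      return new Object[] {count, ack};
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Object[] results = demo();
    System.out.println(results[0] + " / " + results[1]);
  }
}
```

Because the result is typed as Object, the caller has to cast it to the expected type. The wrapper's invoke() method takes the same four kinds of arguments.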

Here is an example:

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper(host, port);
 wrapper.connectAndWait(timeout);
 
 // equivalent to JPPFStats stats = wrapper.statistics();
 JPPFStats stats = (JPPFStats) wrapper.invoke(
   JPPFAdminMBean.DRIVER_MBEAN_NAME, "statistics", (Object[]) null, (String[]) null);
 int nbNodes = stats.getNbNodes();
 
 // equivalent to String[] jobIds = wrapper.getAllJobIds();
 String[] jobIds = (String[]) wrapper.invoke(
   JPPFAdminMBean.DRIVER_JOB_MANAGEMENT_MBEAN_NAME, "getAllJobIds",
   (Object[]) null, (String[]) null);

3.4 Use of an MBean proxy

A proxy is a dynamically created object that implements an interface specified at runtime.

The standard JMX API provides a way to create a proxy to a remote or local MBean. This is done as follows:

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper(host, port);
 wrapper.connectAndWait(timeout);
 
 // create the proxy instance
 DriverJobManagementMBean proxy = 
   wrapper.getProxy(JPPFAdminMBean.DRIVER_JOB_MANAGEMENT_MBEAN_NAME, 
                    DriverJobManagementMBean.class);
 
 // get the ids of all jobs in the server queue
 String[] jobIds = proxy.getAllJobIds();
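
To show what “dynamically created” means here, the following sketch uses the standard java.lang.reflect.Proxy class to create an object that implements an interface at runtime and routes every call through a single handler. The JobAdmin interface and the ProxySketch class are hypothetical; a JMX proxy works along similar lines, except that its handler forwards each call to the MBean server instead of recording it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration of a dynamic proxy; not JPPF code.
final class ProxySketch {
  // hypothetical management interface used only by this example
  interface JobAdmin {
    String[] getAllJobIds();
    void cancelJob(String jobId);
  }

  // Create a proxy that records each call, e.g. "cancelJob(java.lang.String)".
  static JobAdmin newProxy(final List<String> calls) {
    InvocationHandler handler = new InvocationHandler() {
      @Override public Object invoke(Object proxy, Method method, Object[] args) {
        // a JMX proxy would send the method name, arguments and signature to the MBean server here
        StringBuilder signature = new StringBuilder();
        for (Class<?> type : method.getParameterTypes()) {
          if (signature.length() > 0) signature.append(',');
          signature.append(type.getName());
        }
        calls.add(method.getName() + "(" + signature + ")");
        // canned result for the only method of JobAdmin that returns a value
        return method.getReturnType() == String[].class ? new String[] {"job-1"} : null;
      }
    };
    return (JobAdmin) Proxy.newProxyInstance(
      JobAdmin.class.getClassLoader(), new Class<?>[] {JobAdmin.class}, handler);
  }

  public static void main(String[] args) {
    List<String> calls = new ArrayList<>();
    JobAdmin admin = newProxy(calls);
    admin.cancelJob(admin.getAllJobIds()[0]);
    // prints [getAllJobIds(), cancelJob(java.lang.String)]
    System.out.println(calls);
  }
}
```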

3.5 Subscribing to MBean notifications

We have seen that the job management MBean, represented by the DriverJobManagementMBean interface, emits notifications of type JobNotification. There are two ways to subscribe to these notifications:

a. Using a proxy to the MBean

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper(host, port);
 wrapper.connectAndWait(timeout);
 ObjectName objectName =
   new ObjectName(JPPFAdminMBean.DRIVER_JOB_MANAGEMENT_MBEAN_NAME);
 MBeanServerConnection mbsc = wrapper.getMbeanConnection();
 // reuse the ObjectName created above to build the proxy
 DriverJobManagementMBean proxy = (DriverJobManagementMBean)
   MBeanServerInvocationHandler.newProxyInstance(
     mbsc, objectName, DriverJobManagementMBean.class, true);
 
 // subscribe to all notifications from the MBean
 proxy.addNotificationListener(myJobNotificationListener, null, null);

b. Using the MBeanServerConnection API

 JMXDriverConnectionWrapper wrapper = new JMXDriverConnectionWrapper(host, port);
 wrapper.connectAndWait(timeout);
 MBeanServerConnection mbsc = wrapper.getMbeanConnection();
 ObjectName objectName = 
   new ObjectName(JPPFAdminMBean.DRIVER_JOB_MANAGEMENT_MBEAN_NAME);
 
 // subscribe to all notifications from the MBean
 mbsc.addNotificationListener(objectName, myJobNotificationListener, null, null);

Here is an example notification listener implementing the NotificationListener interface:

 // this class prints a message each time a job is added to the server's queue
 public class MyJobNotificationListener implements NotificationListener
 {
   // Handle an MBean notification
   public void handleNotification(Notification notification, Object handback)
   {
     JobNotification jobNotif = (JobNotification) notification;
     JobEventType eventType = jobNotif.getEventType();
     // print a message for new jobs only
     if (eventType.equals(JobEventType.JOB_QUEUED))
     {
       String jobId = jobNotif.getJobInformation().getJobId();
       System.out.println("job " + jobId + " was queued at timestamp " 
         + jobNotif.getTimestamp());
     }
   }
 }
 
 NotificationListener myJobNotificationListener = new MyJobNotificationListener();

4 Remote logging

It is possible to receive logging messages from a driver as JMX notifications. Specific implementations are available for Log4j and JDK logging.

To configure Log4j to send JMX notifications, edit the Log4j configuration file of the driver and add the following:

 ### direct messages to the JMX Logger ###
 log4j.appender.JMX=org.jppf.logging.log4j.JmxAppender
 log4j.appender.JMX.layout=org.apache.log4j.PatternLayout
 log4j.appender.JMX.layout.ConversionPattern=%d [%-5p][%c.%M(%L)]: %m\n
 ### set log levels - for more verbose logging change 'info' to 'debug' ###
 log4j.rootLogger=INFO, JPPF, JMX

To configure the JDK logging to send JMX notifications, edit the JDK logging configuration file of the driver as follows:

 # list of handlers
 handlers= java.util.logging.FileHandler, org.jppf.logging.jdk.JmxHandler
 # Write log messages as JMX notifications
 org.jppf.logging.jdk.JmxHandler.level = FINEST
 org.jppf.logging.jdk.JmxHandler.formatter = org.jppf.logging.jdk.JPPFLogFormatter

To receive the logging notifications from a remote application, you can use the following code:

 // get a JMX connection to the driver MBean server
 JMXDriverConnectionWrapper jmxDriver = new JMXDriverConnectionWrapper(host, port);
 jmxDriver.connectAndWait(5000L);
 // get a proxy to the MBean
 JmxLogger driverProxy =
   jmxDriver.getProxy(JmxLogger.DEFAULT_MBEAN_NAME, JmxLogger.class);
 // use a handback object so we know where the log messages come from
 String source = "driver " + jmxDriver.getHost() + ":" + jmxDriver.getPort();
 // subscribe to all notifications from the MBean
 NotificationListener listener = new MyLoggingHandler();
 driverProxy.addNotificationListener(listener, null, source);
 
 // Logging notification listener that prints remote log messages to the console
 public class MyLoggingHandler implements NotificationListener
 {
   // handle the logging notifications
   public void handleNotification(Notification notification, final Object handback)
   {
     String message = notification.getMessage();
     String toDisplay = handback.toString() + ": " + message;
     System.out.println(toDisplay);
   }
 }
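
The role of the handback object can be tried out without a running driver. In the sketch below, two local NotificationBroadcasterSupport instances from the standard JMX API stand in for the logging MBeans of two drivers, and the same listener is registered on both with a different handback. The class HandbackSketch, the host names, the port value and the notification type “demo.log” are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;

// Stand-alone illustration of handback objects; not JPPF code.
final class HandbackSketch {
  // Returns the "<handback>: <message>" lines collected by the listener.
  static List<String> demo() {
    final List<String> received = new ArrayList<>();
    NotificationListener listener = new NotificationListener() {
      @Override public void handleNotification(Notification notification, Object handback) {
        received.add(handback + ": " + notification.getMessage());
      }
    };
    // two local emitters stand in for the logging MBeans of two drivers
    NotificationBroadcasterSupport driver1 = new NotificationBroadcasterSupport();
    NotificationBroadcasterSupport driver2 = new NotificationBroadcasterSupport();
    // same listener, no filter, one handback per source
    driver1.addNotificationListener(listener, null, "driver host1:11198");
    driver2.addNotificationListener(listener, null, "driver host2:11198");
    // "demo.log" is an arbitrary notification type chosen for this sketch
    driver1.sendNotification(new Notification("demo.log", driver1, 1L, "started"));
    driver2.sendNotification(new Notification("demo.log", driver2, 1L, "queue empty"));
    return received;
  }

  public static void main(String[] args) {
    // prints [driver host1:11198: started, driver host2:11198: queue empty]
    System.out.println(demo());
  }
}
```

Since the handback is handed to the listener unchanged with every notification, a single listener instance can serve any number of drivers.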

JPPF Copyright © 2005-2020 JPPF.org Powered by MediaWiki