Node configuration

From JPPF 3.3 Documentation




1 Server discovery

By default, JPPF nodes are configured to automatically discover active servers on the network. As we have seen in the server discovery configuration, this is possible thanks to the UDP broadcast mechanism of the server. On its end, the node needs to join the same UDP group to subscribe to the broadcasts from the server, which is done by configuring the following properties:

# Enable or disable automatic discovery of JPPF drivers
jppf.discovery.enabled = true

# UDP multicast group to which drivers broadcast their connection parameters
jppf.discovery.group =

# UDP multicast port to which drivers broadcast their connection parameters
jppf.discovery.port = 11111

# How long in milliseconds the node will attempt to automatically discover a driver
# before falling back to the manual configuration parameters
jppf.discovery.timeout = 5000

# IPv4 address inclusion patterns
jppf.discovery.include.ipv4 = 

# IPv4 address exclusion patterns
jppf.discovery.exclude.ipv4 = 

# IPv6 address inclusion patterns
jppf.discovery.include.ipv6 = 

# IPv6 address exclusion patterns
jppf.discovery.exclude.ipv6 = 

For the node to actually find a server on the network, the values for the group and port must be the same for a node and at least one server. If multiple servers are found on the network, the node will arbitrarily pick one.

Note the property jppf.discovery.timeout: it defines a fallback strategy that will cause the node to connect to the server defined in the manual configuration parameters after the specified time has elapsed.

The last four properties define inclusion and exclusion patterns for IPv4 and IPv6 addresses. Each of them defines a list of comma- or semicolon-separated patterns. For the syntax of the IPv4 patterns, please refer to the Javadoc for the class IPv4AddressPattern, and to IPv6AddressPattern for the syntax of IPv6 patterns. This enables filtering out unwanted IP addresses: the discovery mechanism will only allow addresses that are included and not excluded.

Let's take for instance the following pattern specifications:

jppf.discovery.include.ipv4 = 192.168.1.
jppf.discovery.exclude.ipv4 = 192.168.1.128-

The inclusion pattern only allows IP addresses in the range to The exclusion pattern filters out IP addresses in the range to Thus, we have actually defined a filter that only accepts addresses in the range to
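To make the include/exclude semantics concrete, here is a minimal sketch in Java of this kind of filtering. It is not the actual JPPF IPv4AddressPattern implementation (see its Javadoc for the full syntax); it only handles a trailing-dot prefix and closed ranges like "0-127", which is enough to illustrate "included and not excluded":

```java
// Simplified illustration of IPv4 inclusion/exclusion filtering. Supports
// only a trailing-dot prefix ("192.168.1.") and closed component ranges
// ("192.168.1.0-127"); the real IPv4AddressPattern syntax is richer.
public class Ipv4FilterSketch {

    /** Returns true if the address matches the given simplified pattern. */
    static boolean matches(String pattern, String address) {
        String[] p = pattern.split("\\.", -1);
        String[] a = address.split("\\.", -1);
        for (int i = 0; i < p.length; i++) {
            String part = p[i];
            if (part.isEmpty()) return true;        // trailing dot: prefix match
            int value = Integer.parseInt(a[i]);
            if (part.contains("-")) {               // closed range like "0-127"
                String[] r = part.split("-");
                int lo = Integer.parseInt(r[0]), hi = Integer.parseInt(r[1]);
                if (value < lo || value > hi) return false;
            } else if (Integer.parseInt(part) != value) {
                return false;
            }
        }
        return true;
    }

    /** An address passes when it is included and not excluded. */
    static boolean accepted(String include, String exclude, String address) {
        return matches(include, address) && !matches(exclude, address);
    }

    public static void main(String[] args) {
        // include 192.168.1.*, exclude 192.168.1.128-255: accepts .0 to .127
        System.out.println(accepted("192.168.1.", "192.168.1.128-255", "192.168.1.10"));   // true
        System.out.println(accepted("192.168.1.", "192.168.1.128-255", "192.168.1.200"));  // false
    }
}
```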

Instead of these 2 patterns, we could have simply defined the following equivalent inclusion pattern:

jppf.discovery.include.ipv4 = 192.168.1.0-127

2 Manual network configuration

If server discovery is disabled, network access to a server must be configured manually. To this effect, the node requires the address of the host on which the server is running, and a TCP port, as shown in this example:

# IP address or host name of the server = my_host
# JPPF server port
jppf.server.port = 11111

Leaving these properties undefined is equivalent to assigning them their default values (i.e. “localhost” for the host address and 11111 for the server port).

Backward compatibility with JPPF v2.x: To avoid too much disruption in applications configured for JPPF v2.x, JPPF will use the server port defined with the "old" property "class.server.port" if "jppf.server.port" is not defined.

3 JMX management configuration

JPPF uses JMX to provide remote management capabilities for the nodes, and uses the JMXMP connector for communication.

The management features are enabled by default; this behavior can be changed by setting the following property:

# Enable or disable management of this node
jppf.management.enabled = true

When management is enabled, the following properties must be defined:

# JMX management host IP address. If not specified (recommended), the first non-local
# IP address (i.e. neither nor localhost) on this machine will be used.
# If no non-local IP is found, localhost will be used. = localhost

# JMX management port, used by the remote JMX connector
jppf.management.port = 11198

These properties have the same meaning and usage as for a server.

4 Recovery and failover

When the connection to a server is interrupted, the node will automatically attempt, for a given length of time, and at regular intervals, to reconnect to the same server. These properties are configured as follows, with their default values:

# number of seconds before the first reconnection attempt
reconnect.initial.delay = 1

# time after which the system stops trying to reconnect, in seconds
# a value of zero or less means it never stops
reconnect.max.time = 60

# time between two connection attempts, in seconds
reconnect.interval = 1

With these values, the recovery mechanism will attempt to reconnect to the server after an initial 1 second delay, for up to 60 seconds, with connection attempts at 1 second intervals.
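The retry policy above can be sketched as a small loop. This is an illustration of the configured behavior, not JPPF's internal code; the connect action is supplied by the caller so the control flow stays visible (a zero initial delay and interval are used in the example to keep it fast):

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Sketch of the reconnection policy: wait for the initial delay, then retry
// at fixed intervals until either a connection succeeds or the maximum time
// is exceeded. Parameter names mirror the configuration properties.
public class ReconnectSketch {

    /**
     * @param connect      attempts one connection, returns true on success
     * @param initialDelay seconds before the first attempt (reconnect.initial.delay)
     * @param maxTime      seconds before giving up, <= 0 means never (reconnect.max.time)
     * @param interval     seconds between attempts (reconnect.interval)
     */
    static boolean reconnect(BooleanSupplier connect, long initialDelay, long maxTime, long interval)
            throws InterruptedException {
        TimeUnit.SECONDS.sleep(initialDelay);
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(maxTime);
        while (maxTime <= 0 || System.nanoTime() < deadline) {
            if (connect.getAsBoolean()) return true;   // reconnected
            TimeUnit.SECONDS.sleep(interval);          // wait before the next attempt
        }
        return false;                                  // gave up after maxTime
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated server that accepts the 3rd connection attempt.
        int[] attempts = {0};
        boolean ok = reconnect(() -> ++attempts[0] >= 3, 0, 60, 0);
        System.out.println(ok + " after " + attempts[0] + " attempts");
    }
}
```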

5 Interaction of failover and server discovery

When discovery is enabled for the node (jppf.discovery.enabled = true) and the maximum reconnection time is not infinite (reconnect.max.time = <strictly_positive_value>), a sophisticated failover mechanism takes place, following the sequence of steps below:

  • the node attempts to reconnect to the driver to which it was previously connected (or attempted to connect), during a maximum time specified by the configuration property "reconnect.max.time"
  • during this maximum time, it will make multiple attempts to connect to the same driver. This covers the case where the driver is restarted in the meantime.
  • after this maximum time has elapsed, it will attempt to auto-discover another driver, during a maximum time, specified via the configuration property "jppf.discovery.timeout" (in milliseconds)
  • if the node still fails to reconnect after this timeout has expired, it will fall back to the driver manually specified in the node's configuration file
  • the cycle starts again
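The cycle above can be sketched as a state loop. Each step is a pluggable action returning true when a connection is obtained; the per-step timeouts (reconnect.max.time, jppf.discovery.timeout) are omitted so the order of the steps stays visible. This is an illustration of the sequence, not JPPF's implementation:

```java
import java.util.function.BooleanSupplier;

// Sketch of the failover cycle: previous driver first, then discovery,
// then the manually configured driver, then start over.
public class FailoverSketch {

    /** Runs the cycle up to maxCycles times; returns the step that connected, or null. */
    static String runCycle(BooleanSupplier previousDriver, BooleanSupplier discovery,
                           BooleanSupplier manualConfig, int maxCycles) {
        for (int i = 0; i < maxCycles; i++) {
            if (previousDriver.getAsBoolean()) return "previous";   // reconnect.max.time window
            if (discovery.getAsBoolean())      return "discovery";  // jppf.discovery.timeout window
            if (manualConfig.getAsBoolean())   return "manual";     // configured host/port
        }
        return null;  // no driver reachable within maxCycles iterations
    }

    public static void main(String[] args) {
        // Previous driver is gone, discovery finds nothing, manual config works.
        String step = runCycle(() -> false, () -> false, () -> true, 3);
        System.out.println(step);  // manual
    }
}
```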

6 Recovery from hardware failures

The mechanism to recover from hardware failures has its counterpart on each node, which works as follows:

  1. the node establishes a specific connection to the server, dedicated to failure detection
  2. at connection time, a handshake protocol takes place, where the node communicates a unique id (UUID) to the server
  3. the node will then attempt to get a message from the server (“check” message).
  4. if no message is received from the server within the specified time frame, a specified number of times in a row, the node considers the connection to the server broken, closes it cleanly, and lets the recovery and failover mechanism take over, as described in the previous section Interaction of failover and server discovery.
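The detection rule in step 4 reduces to counting consecutive missed check messages. The sketch below illustrates that rule only; it is not JPPF's implementation, and the actual read timeout per attempt is handled by the socket layer:

```java
// Sketch of the failure-detection rule: each call to onCheck() reports
// whether the "check" message arrived within the read timeout
// (jppf.recovery.read.timeout). After jppf.recovery.max.retries consecutive
// misses, the connection is declared broken.
public class FailureDetectorSketch {
    private final int maxRetries;
    private int consecutiveMisses;

    FailureDetectorSketch(int maxRetries) { this.maxRetries = maxRetries; }

    /** @param received true if the check message arrived in time
     *  @return true if the connection should now be considered broken */
    boolean onCheck(boolean received) {
        consecutiveMisses = received ? 0 : consecutiveMisses + 1;  // reset on success
        return consecutiveMisses >= maxRetries;
    }

    public static void main(String[] args) {
        FailureDetectorSketch d = new FailureDetectorSketch(2);  // max.retries = 2
        System.out.println(d.onCheck(false));  // one miss: not broken yet -> false
        System.out.println(d.onCheck(false));  // two misses in a row -> true
    }
}
```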

The following configuration properties control the node side of the hardware failure recovery mechanism implemented by the server:

# Enable recovery from hardware failures on the node.
# Default value is false (disabled).
jppf.recovery.enabled = false

# Dedicated port number for the detection of node failure, must be the same as
# the value specified in the server configuration. Default value is 22222.
jppf.recovery.server.port = 22222
# Maximum number of attempts to get a message from the server before the
# connection is considered broken. Default value is 2.
jppf.recovery.max.retries = 2
# Maximum time in milliseconds allowed for each attempt to get a message
# from the server. Default value is 60000 (1 minute).
jppf.recovery.read.timeout = 60000

Note: if server discovery is active for a node, then the port number specified for the driver will override the one specified in the node's configuration.

7 Processing threads

A node can process multiple tasks concurrently, using a pool of threads. The size of this pool is configured as follows:

# number of threads running tasks in this node
processing.threads = 4

If this property is not defined, its value defaults to the number of processors or cores available to the JVM.
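This default can be expressed in a couple of lines. The sketch reads "processing.threads" from a plain java.util.Properties object for illustration; JPPF reads it from its own configuration file:

```java
import java.util.Properties;

// Sketch of the default described above: use the configured value when
// present, otherwise the number of processors available to the JVM.
public class ThreadCountSketch {

    static int processingThreads(Properties config) {
        String value = config.getProperty("processing.threads");
        return value != null ? Integer.parseInt(value.trim())
                             : Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        System.out.println(processingThreads(config));   // number of cores on this machine
        config.setProperty("processing.threads", "4");
        System.out.println(processingThreads(config));   // 4
    }
}
```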

8 Node process configuration

In the same way as for a server (see Server process configuration), the node is made of 2 processes. In addition to the properties and environment inherited from the controller process, it is possible to specify other JVM options via the following configuration property:

jppf.jvm.options = -Xms64m -Xmx512m

As for the server, it is possible to specify additional class path elements through this property, by adding one or more “-cp” or “-classpath” options (unlike the Java command which only accepts one). For example:

jppf.jvm.options = -cp lib/myJar.jar -cp lib/OtherJar.jar -Xmx512m
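One way to picture how multiple -cp / -classpath options can coexist is to merge them into the single classpath the java command expects. The sketch below is an illustration of that idea, not JPPF's actual option parser, and it ignores quoting and other edge cases for brevity:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch: split a jppf.jvm.options string into a merged classpath
// (all -cp/-classpath values joined with the platform path separator)
// and the remaining JVM options.
public class JvmOptionsSketch {

    /** Returns {mergedClasspath, remainingOptions}. */
    static String[] split(String jvmOptions) {
        List<String> cp = new ArrayList<>(), rest = new ArrayList<>();
        String[] tokens = jvmOptions.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals("-cp") || tokens[i].equals("-classpath")) {
                cp.add(tokens[++i]);          // the value follows the option
            } else {
                rest.add(tokens[i]);
            }
        }
        return new String[] {
            String.join(File.pathSeparator, cp), String.join(" ", rest)
        };
    }

    public static void main(String[] args) {
        String[] parts = split("-cp lib/myJar.jar -cp lib/OtherJar.jar -Xmx512m");
        System.out.println(parts[0]);  // lib/myJar.jar:lib/OtherJar.jar (on Unix)
        System.out.println(parts[1]);  // -Xmx512m
    }
}
```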

9 Class loader cache

Each node creates a specific class loader for each new client whose tasks are executed in that node. The cache itself is managed as a bounded queue, and the oldest class loader will be evicted from the cache whenever the maximum size is reached. The evicted class loader then becomes unreachable and can be garbage collected. In most modern JDKs, this also results in the classes being unloaded.
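A bounded, oldest-first eviction cache like the one described above can be sketched with LinkedHashMap's removeEldestEntry hook. This is an illustration of the eviction policy, not JPPF's actual cache; keys stand in for client identifiers and String values stand in for per-client class loaders:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded cache: when size exceeds maxSize, the eldest (oldest inserted)
// entry is evicted, making it unreachable and eligible for GC.
public class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public BoundedCacheSketch(int maxSize) { this.maxSize = maxSize; }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;  // evict once the bound is exceeded
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, String> cache = new BoundedCacheSketch<>(2);
        cache.put("client-1", "loader-1");
        cache.put("client-2", "loader-2");
        cache.put("client-3", "loader-3");   // evicts client-1, the oldest
        System.out.println(cache.keySet());  // [client-2, client-3]
    }
}
```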

If the class loader cache size is too large, this can lead to an out of memory condition in the node, especially in these 2 scenarios:

  • if too many classes are loaded, the space reserved to the class definitions (permanent generation in Oracle JDK) will fill up and cause an “OutOfMemoryError: PermGen space”
  • if the classes hold a large amount of static data (via static fields and static initializers), an “OutOfMemoryError: Heap Space” will be thrown

To mitigate this, the size of the class loader cache can be configured in the node as follows:

jppf.classloader.cache.size = 50

The default value for this property is 50, and the value must be at least equal to 1.

10 Class loader resources cache

To avoid unnecessary network round trips, the node class loaders locally store the resources found in their extended classpath when one of the methods getResourceAsStream(), getResource(), getResources() or getMultipleResources() is called. The type of storage and location of the file-persisted cache can be configured as follows:

# type of storage: either “file” (the default) or “memory” = file
# root location of the file-persisted caches
jppf.resource.cache.dir = some_directory

When “file” persistence is configured, the node will fall back to memory persistence if the resource cannot be saved to the file system for any reason. This could happen, for instance, when the file system runs out of space.
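The fallback described above amounts to: try the file system, and keep the resource in memory if the write fails for any reason. A minimal sketch of that behavior (the class and method names are illustrative, not JPPF's API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Sketch: persist a resource under the cache directory
// (jppf.resource.cache.dir); on any I/O failure, fall back to memory.
public class ResourceCacheSketch {
    private final Path cacheDir;
    private final Map<String, byte[]> memoryCache = new HashMap<>();

    ResourceCacheSketch(Path cacheDir) { this.cacheDir = cacheDir; }

    /** Stores the resource; returns "file" or "memory" depending on where it landed. */
    String store(String name, byte[] data) {
        try {
            Files.createDirectories(cacheDir);
            Files.write(cacheDir.resolve(name), data);
            return "file";
        } catch (IOException e) {
            memoryCache.put(name, data);  // fallback: keep the resource in memory
            return "memory";
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("jppf-cache");
        ResourceCacheSketch cache = new ResourceCacheSketch(dir);
        System.out.println(cache.store("a.txt", "hello".getBytes()));   // file

        // A regular file used as the cache directory forces the memory fallback.
        Path notADir = Files.createTempFile("not-a-dir", "");
        ResourceCacheSketch broken = new ResourceCacheSketch(notADir);
        System.out.println(broken.store("a.txt", "hello".getBytes())); // memory
    }
}
```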

For more details, please refer to the “Class Loading In JPPF > Local caching of network resources” section of this documentation.

11 Security policy

It is possible to limit what the nodes can do on the machine that hosts them. To this effect, they provide the means to restrict what permissions are granted to them on their host. These permissions are based on the Java security policy model. Discussing Java security is not in the scope of this document, but there is ample documentation about it in the JDK documentation.

To implement security, nodes require a security policy file. The syntax of this file is similar to that of Java security policy files, except that it only accepts permission entries (no grant or security context entries).

Some examples of permission entries:

// permission to read, write, delete the node log file in the current directory
permission "${user.dir}/jppf-node.log", "read,write,delete";
// permission to read all log4j system properties
permission java.util.PropertyPermission "log4j.*", "read";
// permission to connect to a MySQL database on the default port on localhost
permission "localhost:3306", "connect,listen";
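These entries use standard Java permission classes, so their implies() method shows exactly what each one grants. The snippet below checks a few cases directly (the file path is an example, not the node's real layout):

```java
import java.io.FilePermission;
import java.net.SocketPermission;

// Checking what the policy entries above actually grant, via implies().
public class PermissionSketch {
    public static void main(String[] args) {
        FilePermission logPerm =
            new FilePermission("/opt/node/jppf-node.log", "read,write,delete");
        // Granted: a subset of the actions on the same file.
        System.out.println(logPerm.implies(
            new FilePermission("/opt/node/jppf-node.log", "read")));      // true
        // Not granted: a different file.
        System.out.println(logPerm.implies(
            new FilePermission("/opt/node/other.log", "read")));          // false

        SocketPermission dbPerm =
            new SocketPermission("localhost:3306", "connect,listen");
        System.out.println(dbPerm.implies(
            new SocketPermission("localhost:3306", "connect")));          // true
    }
}
```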

To enable the security policy, the node configuration file must contain the following property definition:

# Path to the security file, relative to the current directory or classpath
jppf.policy.file = jppf.policy

When this property is not defined, or the policy file cannot be found, security is disabled.

The policy file does not have to be local to the node. If it is not present locally, the node will download it from the server. In this case it has to be accessible by the server, and the path to the policy file will be interpreted as a path on the server's file system. This feature, combined with the ability to remotely restart the nodes, makes it easy to update a security policy and propagate the changes to all the nodes.

12 Full node configuration file (default values)

# Host name, or ip address, of the host the JPPF driver is running on = localhost

# JPPF server port number
jppf.server.port = 11111

# Enabling JMX features
jppf.management.enabled = true

# JMX management host IP address = localhost

# JMX management port
jppf.management.port = 12001

# path to the JPPF security policy file
#jppf.policy.file = config/jppf.policy

# Enable/Disable automatic discovery of JPPF drivers
jppf.discovery.enabled = true

# UDP multicast group to which drivers broadcast their connection parameters
jppf.discovery.group =

# UDP multicast port to which drivers broadcast their connection parameters
jppf.discovery.port = 11111

# How long the node will attempt to automatically discover a driver before
# falling back to the parameters specified in this configuration file
jppf.discovery.timeout = 5000

# Automatic recovery: number of seconds before the first reconnection attempt
reconnect.initial.delay = 1

# Time after which the system stops trying to reconnect, in seconds
reconnect.max.time = 60

# Automatic recovery: time between two connection attempts, in seconds
reconnect.interval = 1

# Processing Threads: number of threads running tasks in this node
#processing.threads = 1

# Other JVM options added to the java command line when the node is started as
# a subprocess. Multiple options are separated by spaces
jppf.jvm.options = -server -Xmx256m

# size of the node's class loader cache
jppf.classloader.cache.size = 10

JPPF Copyright © 2005-2020