Configuring
JVM parameters
All JPPF components use a configuration file that tells them what to do at startup. The location of this file is passed to the JVM through a system property called jppf.config. For example:
java -cp myclasspath -Djppf.config=jppf-config.properties mypackage.MyClass
The JPPF configuration file is a standard Java properties file and obeys the same syntactic rules. For the configuration to be found, it must be accessible, using the path set in the jppf.config property, from either the file system or the JVM's classpath.
The lookup for the configuration file and the properties it defines is done in the following order:
- search for a file with the specified path
- if no file is found, search for a resource in the classpath with the specified path
- if neither of the above succeeds, use default values
Special note: this lookup mechanism makes it possible for the nodes to download their configuration file from the server. If the configuration file is not present in the file system or the local classpath of the node, it will be looked up using the network class loader and downloaded from the server, if it is in the server's local classpath. This has an important implication for the ease of deployment of new configurations, especially when the number of nodes is very large.
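The lookup order described above can be sketched in a few lines of Java. This is only an illustration, not the JPPF implementation: the class ConfigLookupSketch and its load method are hypothetical names, and the network class loader step used by the nodes is not modeled.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical helper illustrating the lookup order; not the JPPF implementation. */
final class ConfigLookupSketch {

  /** Tries the file system, then the classpath; an empty result means "use defaults". */
  static Properties load(String path) {
    Properties props = new Properties();
    if (path == null) return props;                 // no path given: defaults apply
    try {
      Path file = Paths.get(path);
      if (Files.isRegularFile(file)) {              // step 1: file system
        try (InputStream in = Files.newInputStream(file)) {
          props.load(in);
        }
        return props;
      }
      ClassLoader cl = Thread.currentThread().getContextClassLoader();
      if (cl == null) cl = ConfigLookupSketch.class.getClassLoader();
      try (InputStream in = cl.getResourceAsStream(path)) {  // step 2: classpath
        if (in != null) props.load(in);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;                                   // step 3: possibly empty
  }

  public static void main(String[] args) {
    Properties props = load(System.getProperty("jppf.config"));
    // "localhost" is the documented default for jppf.server.host
    System.out.println(props.getProperty("jppf.server.host", "localhost"));
  }
}
```

Returning an empty Properties object rather than failing keeps the third step simple: callers read each property with its documented default.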
In addition, JPPF uses Log4j, a logging tool from the Apache Software Foundation. Log4j requires its own configuration file, set through a system property called log4j.configuration. Our previous example then becomes:
java -cp myclasspath -Djppf.config=jppf-config.properties -Dlog4j.configuration=log4j.properties mypackage.MyClass
Note: the list and use of the log4j properties are not discussed in this document.
Communication properties
In this section, we detail the properties that have to be defined so that clients, servers, nodes and tools can communicate with each other.
Since the communication between JPPF components is based on TCP sockets, we have to define TCP hosts and ports through the following properties:
- jppf.server.host: defines the name, or IP address, of the host the JPPF server (also called driver) is running on
Examples:
jppf.server.host = www.jppf.org
jppf.server.host = 192.168.0.8
The default value for this property is localhost
- class.server.port: defines the port number used by the JPPF network class loader. This is the connection through which JPPF and application classes are loaded into the nodes.
Example:
class.server.port = 5011
The default value for this property is 11111
- app.server.port: defines the port number through which the clients and applications connect to the server, to submit tasks or send administration requests, then get the results back.
Example:
app.server.port = 5012
The default value for this property is 11112
- node.server.port: defines the port number through which the nodes receive tasks from the server, then send the results back.
Example:
node.server.port = 5013
The default value for this property is 11113
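As an illustration of how the four communication properties and their documented defaults fit together, here is a minimal Java sketch. The class CommunicationSettingsSketch is a hypothetical name, and falling back to the default when a port value is malformed is an assumption of this sketch, not documented JPPF behavior.

```java
import java.util.Properties;

/** Hypothetical holder for the communication settings; not a JPPF class. */
final class CommunicationSettingsSketch {
  final String host;
  final int classPort;
  final int appPort;
  final int nodePort;

  CommunicationSettingsSketch(Properties props) {
    // Defaults are the ones documented for each property
    host = props.getProperty("jppf.server.host", "localhost").trim();
    classPort = intProperty(props, "class.server.port", 11111);
    appPort = intProperty(props, "app.server.port", 11112);
    nodePort = intProperty(props, "node.server.port", 11113);
  }

  /** Reads an integer property; a missing or malformed value yields the default (assumption). */
  static int intProperty(Properties props, String key, int defaultValue) {
    String value = props.getProperty(key);
    if (value == null) return defaultValue;
    try {
      return Integer.parseInt(value.trim());
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("class.server.port", "5011");
    CommunicationSettingsSketch s = new CommunicationSettingsSketch(props);
    System.out.println(s.host + " " + s.classPort + " " + s.appPort + " " + s.nodePort);
  }
}
```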
JMX Configuration
JPPF uses JMX to provide remote management capabilities for the servers and nodes, and uses the default RMI connector for communication. Each node or server has its own embedded RMI registry.
Note: The management host and port are propagated from the nodes to their attached server, and from the servers to the clients. You must ensure that they are in a form that makes the registry accessible, no matter where it is accessed from.
For instance, if you have a JPPF node on host1 and you specify the host as "localhost" in the node's configuration, then the monitoring features for the node will only be available locally. A JPPF driver or client running on another machine will not be able to connect to the node's management server, since the only address it will know is "localhost".
- jppf.management.host: defines the host name or IP address for the remote management and monitoring of the servers and nodes. It represents the host where an RMI registry is running.
Example:
jppf.management.host = 192.168.0.4
When this property is not defined explicitly, JPPF will automatically fetch the first non-local IP address (meaning not the loopback address) it can find on the current host. If none is found, localhost will be used.
This provides a way to use an identical configuration for all the nodes on a network.
- jppf.management.port: defines the port number for the remote management and monitoring of the servers and nodes. This port is used by an RMI registry where JMX MBeans are bound.
Example:
jppf.management.port = 1098
The default value for this property is 11198
- jppf.management.enabled: enables or disables the management and monitoring features for a node or a server. When disabled (value = false), this avoids the creation of an RMI registry and the possible generation of error log messages by the RMI framework.
Example:
jppf.management.enabled = true
The default value for this property is true
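The fallback described for jppf.management.host (use the configured value, otherwise the first non-loopback address, otherwise localhost) could look roughly like the following. This is a sketch under assumptions: ManagementHostSketch is a hypothetical name, and the order in which the JDK enumerates interfaces and addresses is platform-dependent, so the "first" address may differ from the one JPPF would pick.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Enumeration;
import java.util.Properties;

/** Hypothetical helper; illustrates the documented fallback, not JPPF's own code. */
final class ManagementHostSketch {

  static String resolve(Properties props) {
    String configured = props.getProperty("jppf.management.host");
    if (configured != null && !configured.trim().isEmpty()) {
      return configured.trim();                    // explicit setting wins
    }
    try {
      Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
      while (nics != null && nics.hasMoreElements()) {
        Enumeration<InetAddress> addrs = nics.nextElement().getInetAddresses();
        while (addrs.hasMoreElements()) {
          InetAddress addr = addrs.nextElement();
          if (!addr.isLoopbackAddress()) {
            return addr.getHostAddress();          // first non-loopback address
          }
        }
      }
    } catch (SocketException e) {
      // interfaces could not be listed: fall through to localhost
    }
    return "localhost";
  }

  public static void main(String[] args) {
    System.out.println(resolve(new Properties()));
  }
}
```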
Configuring the driver
In addition to the common communication properties described above, the JPPF server uses a number of additional parameters, to define the communication with peer servers, the maximum heap memory allocated to the driver, the settings for remote debugging, and the initial configuration of its auto-tuning algorithm.
Communicating with peer drivers
A JPPF driver can communicate with other drivers, which are then called peers. When this happens, the driver that initiates the communication will be seen by its peer as a node. Therefore, it needs to know the host and port numbers of its peer for driver-to-node communication and network class loading, through a number of configuration properties, as shown in this example:
# space-separated list of peer names
jppf.peers = driver1
# for each peer:
# peer host name
jppf.peer.driver1.server.host = 192.168.0.4
# port used for node communication
node.peer.driver1.server.port = 11113
# port used by the network class loader
class.peer.driver1.server.port = 11111
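To make the naming pattern of the peer properties explicit, here is a small Java sketch that turns them into a list of peer descriptions. PeerConfigSketch and Peer are hypothetical names. The fallback values used when a peer property is missing (localhost, 11113, 11111) are an assumption borrowed from the common communication properties, not a documented default for peers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical parser for the peer properties; not part of JPPF. */
final class PeerConfigSketch {

  static final class Peer {
    final String name;
    final String host;
    final int nodePort;
    final int classPort;

    Peer(String name, String host, int nodePort, int classPort) {
      this.name = name;
      this.host = host;
      this.nodePort = nodePort;
      this.classPort = classPort;
    }

    @Override
    public String toString() {
      return name + "@" + host + " node=" + nodePort + " class=" + classPort;
    }
  }

  static List<Peer> parse(Properties props) {
    List<Peer> peers = new ArrayList<>();
    String names = props.getProperty("jppf.peers", "").trim();
    if (names.isEmpty()) return peers;
    for (String name : names.split("\\s+")) {      // space-separated peer names
      String host = props.getProperty("jppf.peer." + name + ".server.host", "localhost").trim();
      int nodePort = Integer.parseInt(
          props.getProperty("node.peer." + name + ".server.port", "11113").trim());
      int classPort = Integer.parseInt(
          props.getProperty("class.peer." + name + ".server.port", "11111").trim());
      peers.add(new Peer(name, host, nodePort, classPort));
    }
    return peers;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("jppf.peers", "driver1");
    props.setProperty("jppf.peer.driver1.server.host", "192.168.0.4");
    System.out.println(parse(props));
  }
}
```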
Driver heap memory
- max.memory.option: the maximum memory, expressed in megabytes, available to the driver JVM. Why is this not defined through a JVM -Xmxnnn option? That's because of the JPPF driver's design. The driver is actually implemented as two processes:
- the driver launcher, which launches the actual driver, and provides the ability to restart it, when a request is sent through the administration interface
- the driver itself
Here, the driver launcher uses the memory parameter to configure the JVM instance that runs the driver.
Example:
max.memory.option = 128
The default value for this property is 128
Remote debugging
- remote.debug.port: port on which the driver JVM will listen for remote debugging agents.
The default value for this property is 8000
- remote.debug.suspend: determines whether the JVM should wait for a remote agent to connect before starting.
The default value for this property is false
Example:
remote.debug.port = 8001
remote.debug.suspend = false
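Since the driver launcher starts the actual driver in a separate JVM, max.memory.option and the remote debugging properties presumably end up as options on that JVM's command line. The sketch below shows one plausible translation using the standard -Xmx and -agentlib:jdwp options. It is an assumption that the real launcher builds its command line this way and that it always enables the debug agent; DriverLauncherSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical translation of driver properties into JVM options; not JPPF's launcher. */
final class DriverLauncherSketch {

  static List<String> jvmOptions(Properties props) {
    // Documented defaults: 128 MB, port 8000, suspend = false
    int maxMemory = Integer.parseInt(props.getProperty("max.memory.option", "128").trim());
    int debugPort = Integer.parseInt(props.getProperty("remote.debug.port", "8000").trim());
    boolean suspend = Boolean.parseBoolean(props.getProperty("remote.debug.suspend", "false").trim());

    List<String> options = new ArrayList<>();
    options.add("-Xmx" + maxMemory + "m");          // maximum heap, in megabytes
    options.add("-agentlib:jdwp=transport=dt_socket,server=y,suspend="
        + (suspend ? "y" : "n") + ",address=" + debugPort);
    return options;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("remote.debug.port", "8001");
    System.out.println(jvmOptions(props));
  }
}
```

The resulting list could be passed to a java.lang.ProcessBuilder together with the java executable, the classpath and the driver's main class.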
Automatic performance tuning
- task.bundle.strategy: this property determines whether auto-tuning is turned on or off. The possible values are manual, which disables auto-tuning, and autotuned, which turns it on.
Example:
task.bundle.strategy = autotuned
The default value for this property is manual
- task.bundle.size: determines the size of the task bundles to be used. It is only used if the property task.bundle.strategy is set to manual.
Example:
task.bundle.strategy = manual
task.bundle.size = 5
The default value for this property is 10
- task.bundle.autotuned.strategy: determines the tuning profile to use for automatic tuning. It is only used if the property task.bundle.strategy is set to autotuned. Currently, only one profile is available, but it can be modified via the administration interface.
Example:
task.bundle.strategy = autotuned
task.bundle.autotuned.strategy = smooth
The default value for this property is smooth
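The way the three properties above depend on each other can be summarized in a short Java sketch. BundleStrategySketch is a hypothetical name, and treating any value other than autotuned as manual is an assumption of this sketch.

```java
import java.util.Properties;

/** Hypothetical summary of the bundling settings; not a JPPF class. */
final class BundleStrategySketch {

  static String describe(Properties props) {
    String strategy = props.getProperty("task.bundle.strategy", "manual").trim();
    if ("autotuned".equals(strategy)) {
      // task.bundle.size is ignored in this mode
      String profile = props.getProperty("task.bundle.autotuned.strategy", "smooth").trim();
      return "autotuned, profile=" + profile;
    }
    // task.bundle.autotuned.strategy is ignored in this mode
    int size = Integer.parseInt(props.getProperty("task.bundle.size", "10").trim());
    return "manual, size=" + size;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("task.bundle.strategy", "autotuned");
    System.out.println(describe(props));
  }
}
```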
Configuration of the tuning strategy
It is possible to configure any number of tuning strategies, by specifying the set of algorithm parameter values that define them.
Example: the "smooth" strategy:
# name of the strategy to use
task.bundle.autotuned.strategy = smooth
# define the algorithm parameters
strategy.smooth.minSamplesToAnalyse = 500
strategy.smooth.minSamplesToCheckConvergence = 300
strategy.smooth.maxDeviation = 0.2
strategy.smooth.maxGuessToStable = 10
strategy.smooth.sizeRatioDeviation = 1.5
strategy.smooth.decreaseRatio = 0.2
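Because each strategy is defined by properties of the form strategy.<name>.<parameter>, the parameters of the selected strategy can be collected by prefix. The following Java sketch illustrates the idea; TuningStrategySketch is a hypothetical name, and reading every parameter as a double is a simplification made here.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical reader for the tuning strategy parameters; not a JPPF class. */
final class TuningStrategySketch {

  /** Returns the parameters of the strategy named by task.bundle.autotuned.strategy. */
  static Map<String, Double> parameters(Properties props) {
    String name = props.getProperty("task.bundle.autotuned.strategy", "smooth").trim();
    String prefix = "strategy." + name + ".";
    Map<String, Double> params = new TreeMap<>();
    for (String key : props.stringPropertyNames()) {
      if (key.startsWith(prefix)) {
        // strip the prefix so that only the parameter name remains
        params.put(key.substring(prefix.length()),
            Double.parseDouble(props.getProperty(key).trim()));
      }
    }
    return params;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("task.bundle.autotuned.strategy", "smooth");
    props.setProperty("strategy.smooth.maxDeviation", "0.2");
    props.setProperty("strategy.smooth.minSamplesToAnalyse", "500");
    System.out.println(parameters(props));
  }
}
```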
Full sample configuration file
# Host name, or ip address, of the host the JPPF driver is running on
jppf.server.host = localhost
# port number for the class server that performs remote class loading
class.server.port = 11111
# port number the clients / applications connect to
app.server.port = 11112
# port number the nodes connect to
node.server.port = 11113
# enable the management and monitoring features for this server
jppf.management.enabled = true
# RMI port number for the JMX-based remote management of the server
jppf.management.port = 11198
# space-separated list of peer names
jppf.peers = driver1
# for each peer:
# peer host name
jppf.peer.driver1.server.host = 192.168.0.4
# port used for node communication
node.peer.driver1.server.port = 11113
# port used by the network class loader
class.peer.driver1.server.port = 11111
# Maximum memory, in megabytes, allocated to the JPPF driver
max.memory.option = 128
# debugging options
remote.debug.port = 8001
remote.debug.suspend = false
#task.bundle.size = 5
#task.bundle.strategy = manual
task.bundle.strategy = autotuned
task.bundle.autotuned.strategy = smooth
# define the algorithm parameters
strategy.smooth.minSamplesToAnalyse = 500
strategy.smooth.minSamplesToCheckConvergence = 300
strategy.smooth.maxDeviation = 0.2
strategy.smooth.maxGuessToStable = 10
strategy.smooth.sizeRatioDeviation = 1.5
strategy.smooth.decreaseRatio = 0.2
Configuring a node
The configuration of a node addresses four major areas of its activity: network communication, security, failure recovery and performance.
Network communication
A JPPF node requires two socket connections: one for receiving task bundles from the server, the other for the network classloader. Therefore, a node's communication parameters should be defined as in this example:
# host where the JPPF server is running
jppf.server.host = 192.168.0.8
# port used by the network classloader to request and receive remote classes
class.server.port = 11111
# port through which task bundles are received, and results sent
node.server.port = 11113
In fact, a node uses multiple network class loaders, but they all share the same network connection. The advantages of this include greater scalability, lower consumption of system resources, and a more secure communication model.
Security
JPPF nodes can't just do whatever they want on the machine that hosts them. Therefore, they provide the means to restrict what permissions are granted to them on their host. These permissions are based on the Java security policy model. Discussing Java security is not in the scope of this document, but there is ample documentation about it in the JDK documentation.
To implement security, nodes require a security policy file. The syntax of this file is similar to that of Java security policy files, except that it only accepts permission entries (no security context entries). An example of permission entries:
// permission to read, write, delete node log file in current directory
permission java.io.FilePermission "${user.dir}/jppf-node.log", "read,write,delete";
// permission to read all log4j system properties
permission java.util.PropertyPermission "log4j.*", "read";
To enable the security policy, the node configuration file must contain the following property definition:
# Path to the security file, relative to the current directory or classpath
jppf.policy.file = jppf.policy
When this property is not defined, security is disabled, even if there is a security policy file present.
The policy file does not have to be local to the node. If it is not present locally, the node will fetch it from the server. In this case it has to be locally accessible by the server.
This feature makes it easy to update the security policy and propagate the changes to all the nodes.
Failover and recovery
The nodes have a built-in failover mechanism that makes them try to reconnect when the connection with the server is severed. They will try to reconnect until the connection is established or until a timeout has expired. Three configuration parameters allow finer control over how this is performed:
- reconnect.initial.delay: defines the time, in seconds, before the node starts trying to reconnect to the server.
Example:
# Try to reconnect after 5 seconds have passed
reconnect.initial.delay = 5
The default value for this property is 1 second.
- reconnect.max.time: defines the amount of time, in seconds, after which the node will stop trying to reconnect with the server. When this timeout is reached, the node will shut down gracefully with an error message. The time is counted after the initial delay specified in the property above.
Example:
# Try to reconnect for at most 15 minutes, then shut down
reconnect.max.time = 900
The default value for this property is 60 seconds.
- reconnect.interval: defines the time, in seconds, between two attempts at reconnecting with the server.
Example:
# Try to reconnect every 3 seconds after initial delay
reconnect.interval = 3
The default value for this property is 1 second.
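To see how the three recovery parameters interact, the sketch below computes when reconnection attempts would take place, in seconds after the connection was lost. It is only a model of the timing described above: ReconnectScheduleSketch is a hypothetical name, and whether an attempt is still made exactly when reconnect.max.time is reached is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the reconnection timing; not the JPPF implementation. */
final class ReconnectScheduleSketch {

  /**
   * Returns the attempt times, in seconds after the connection was lost.
   * The maximum time is counted from the end of the initial delay.
   */
  static List<Long> attemptTimes(long initialDelay, long maxTime, long interval) {
    if (interval <= 0) throw new IllegalArgumentException("interval must be positive");
    List<Long> times = new ArrayList<>();
    for (long elapsed = 0; elapsed <= maxTime; elapsed += interval) {
      times.add(initialDelay + elapsed);
    }
    return times;
  }

  public static void main(String[] args) {
    // Values from the three examples above: delay 5 s, max 900 s, interval 3 s
    List<Long> times = attemptTimes(5, 900, 3);
    System.out.println("attempts: " + times.size()
        + ", first at " + times.get(0) + " s"
        + ", last at " + times.get(times.size() - 1) + " s");
  }
}
```

A real node would stop at the first successful attempt; the list is simply the upper bound on what happens when the server never comes back.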
Node performance
The nodes execute their assigned tasks using a pool of worker threads. Depending on the node's hardware and software environment, it can be beneficial, in terms of performance, to specify a number of worker threads greater than the default, which is one thread. One configuration property allows this: processing.threads. Example:
# 2 worker threads in the pool
processing.threads = 2
The default value for this property is 1 thread.
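The effect of processing.threads can be pictured with a standard fixed-size thread pool from java.util.concurrent. This is a generic sketch, not the node's actual execution code: WorkerPoolSketch is a hypothetical name, tasks are modeled as plain Callable objects, and clamping the thread count to at least 1 is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical worker pool sized from processing.threads; not the JPPF node code. */
final class WorkerPoolSketch {

  static int threadCount(Properties props) {
    int n = Integer.parseInt(props.getProperty("processing.threads", "1").trim());
    return Math.max(1, n);                          // never fewer than one worker
  }

  /** Runs all tasks on the pool and returns their results in submission order. */
  static List<Integer> runAll(Properties props, List<Callable<Integer>> tasks) {
    ExecutorService pool = Executors.newFixedThreadPool(threadCount(props));
    try {
      List<Integer> results = new ArrayList<>();
      for (Future<Integer> future : pool.invokeAll(tasks)) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("processing.threads", "2");
    List<Callable<Integer>> tasks = new ArrayList<>();
    for (int i = 1; i <= 4; i++) {
      final int n = i;
      tasks.add(() -> n * n);
    }
    System.out.println(runAll(props, tasks));
  }
}
```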
Overriding the server's performance tuning parameters
It is also possible to override the tuning strategy and parameters used by the driver, by configuring them on the node side. To set the parameters, please see Automatic performance tuning and Configuration of the tuning strategy. The settings on the node side will override those determined by the driver, even if they have been set through the administration console. The override is effective if you define, in the node's configuration properties file, the property task.bundle.strategy. To disable the override, simply comment it out.
Full sample configuration file
# Host name, or ip address, of the host the JPPF driver is running on
jppf.server.host = localhost
# port number for the class server that performs remote class loading
class.server.port = 11111
# port number the nodes connect to
node.server.port = 11113
# enable the management and monitoring features for this node
jppf.management.enabled = true
# RMI port number for the JMX-based remote management and monitoring of the node
jppf.management.port = 12001
# path to the JPPF security policy file
jppf.policy.file = jppf.policy
# Automatic recovery: number of seconds before the first reconnection attempt
reconnect.initial.delay = 1
# Automatic recovery: time after which the system stops trying to reconnect
reconnect.max.time = 900
# Automatic recovery: time between two connection attempts, in seconds
reconnect.interval = 1
# Processing Threads: number of threads running tasks in this node
processing.threads = 1
Configuring a client or the administration tool
A JPPF client can connect to multiple servers. Each of these server connections is named and is assigned a priority. The server with the highest priority is the one the client will use to submit tasks or administration requests. If multiple servers have the same priority from the client's point of view, they are considered a pool, and tasks are evenly distributed over the connections in the pool. The priority determines a fallback order for the client: if the connection with the highest priority fails for any reason, the client falls back to the connection with the next highest priority.
A JPPF client requires 2 network connections with each server: one for submitting tasks and administration requests, and receiving responses, the other to provide class definitions (or other resources) when requested by the server classloader.
Examples:
# Space-separated names of the available driver connections
jppf.drivers = driver1 driver2
# Host name, or ip address, of the host the JPPF driver is running on
driver1.jppf.server.host = localhost
# port number for the class server that performs remote class loading
driver1.class.server.port = 11111
# port number the clients connect and submit requests to
driver1.app.server.port = 11112
# RMI port number for the JMX-based remote management of the server
driver1.jppf.management.port = 11198
# priority assigned to the server connection
driver1.priority = 10
driver2.jppf.server.host = localhost
driver2.class.server.port = 11121
driver2.app.server.port = 11122
driver2.jppf.management.port = 11199
driver2.priority = 10
This example defines a pool of two server connections. Since the connections have the same priority, the load is balanced over the connections in the pool.
jppf.drivers = driver1 driver2
driver1.jppf.server.host = localhost
driver1.class.server.port = 11111
driver1.app.server.port = 11112
driver1.jppf.management.port = 11198
driver1.priority = 10
driver2.jppf.server.host = localhost
driver2.class.server.port = 11121
driver2.app.server.port = 11122
driver2.jppf.management.port = 11199
driver2.priority = 5
This example defines a fallback strategy for the client: driver1 has a higher priority than driver2 and will be used as long as the connection is available. Should the connection fail, the client will switch to driver2 and resubmit the tasks pending from the connection to driver1.
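The pooling and fallback rules illustrated by the two examples above can be expressed as a small selection function: among the connections that are still available, keep those that share the highest priority. The Java sketch below is an illustration only. DriverSelectionSketch and Connection are hypothetical names, the available flag stands in for real connection state, and a default priority of 0 for connections without a priority property is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical model of priority-based connection selection; not the JPPF client code. */
final class DriverSelectionSketch {

  static final class Connection {
    final String name;
    final int priority;
    boolean available = true;                       // stand-in for the real connection state

    Connection(String name, int priority) {
      this.name = name;
      this.priority = priority;
    }
  }

  static List<Connection> parse(Properties props) {
    List<Connection> result = new ArrayList<>();
    String names = props.getProperty("jppf.drivers", "").trim();
    if (names.isEmpty()) return result;
    for (String name : names.split("\\s+")) {
      int priority = Integer.parseInt(props.getProperty(name + ".priority", "0").trim());
      result.add(new Connection(name, priority));
    }
    return result;
  }

  /** Returns the names of the available connections sharing the highest priority. */
  static List<String> activePool(List<Connection> connections) {
    int best = Integer.MIN_VALUE;
    for (Connection c : connections) {
      if (c.available && c.priority > best) best = c.priority;
    }
    List<String> pool = new ArrayList<>();
    for (Connection c : connections) {
      if (c.available && c.priority == best) pool.add(c.name);
    }
    return pool;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("jppf.drivers", "driver1 driver2");
    props.setProperty("driver1.priority", "10");
    props.setProperty("driver2.priority", "5");
    List<Connection> connections = parse(props);
    System.out.println(activePool(connections));
    connections.get(0).available = false;           // simulate losing driver1
    System.out.println(activePool(connections));
  }
}
```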
In addition, it is possible to specify the recovery properties for these connections, in the same way as for the nodes:
# Automatic recovery: number of seconds before the first reconnection attempt
reconnect.initial.delay = 1
# Automatic recovery: time after which the system stops trying to reconnect
reconnect.max.time = 60
# Automatic recovery: time between two connection attempts, in seconds
reconnect.interval = 1
Here is a full client configuration file:
# Space-separated names of the available driver connections
jppf.drivers = driver1 driver2 driver3
# Host name, or ip address, of the host the JPPF driver is running on
driver1.jppf.server.host = localhost
# port number for the class server that performs remote class loading
driver1.class.server.port = 11111
# port number the clients connect and submit requests to
driver1.app.server.port = 11112
# RMI port number for the JMX-based remote management of the server
driver1.jppf.management.port = 11198
# priority assigned to the server connection
driver1.priority = 10
# Second connection in the pool
driver2.jppf.server.host = localhost
driver2.class.server.port = 11121
driver2.app.server.port = 11122
driver2.jppf.management.port = 11199
driver2.priority = 10
# Server to fall back to if all connections in the pool fail
driver3.jppf.server.host = otherhost
driver3.class.server.port = 11111
driver3.app.server.port = 11112
driver3.jppf.management.port = 12000
driver3.priority = 5
# Automatic recovery: number of seconds before the first reconnection attempt
reconnect.initial.delay = 1
# Automatic recovery: time after which the system stops trying to reconnect
reconnect.max.time = 60
# Automatic recovery: time between two connection attempts, in seconds
reconnect.interval = 1
