JPPF

The open source
grid computing
solution


Database services

From JPPF 6.0 Documentation



A great many JPPF-based applications have a need to access one or more databases in the computations they submit to the grid. While it is quite easy to define and use database connectivity from within JPPF tasks, we have found that it can also be a source of headaches when addressing issues such as:

  • caching and maintaining data sources that are reusable across job submissions
  • handling the availability of the JDBC driver and data source implementation classes
  • avoiding the configuration and deployment burden which occurs when database access is needed on a large number of nodes in the grid

To address these issues, JPPF provides a database services component which includes:

  • a facility to easily define and configure JDBC data sources from the JPPF configuration (static definition)
  • a data source registry and factory to look up and manipulate data sources, as well as create new ones dynamically
  • a set of functionalities to define data sources in a JPPF driver and propagate them automatically to the nodes

The data source and connection pooling implementations are provided by the HikariCP library. Consequently, the names of the corresponding configuration properties are those provided in the HikariCP documentation.

1 Configuring JDBC data sources

1.1 Data source definition

A data source can be defined in the JPPF configuration with a set of properties in the following format:

jppf.datasource.<config_id>.name = <datasource_name>
jppf.datasource.<config_id>.<property1> = <value1>
...
jppf.datasource.<config_id>.<propertyN> = <valueN>

where:

  • <config_id> is an arbitrary string used to identify which properties are used to define the data source and to distinguish these properties from those of other data source definitions
  • <datasource_name> is a user-defined name that will be used to retrieve the data source via API
  • <propertyi> is the name of a HikariCP-supported property

Here is a concrete example, defining two data sources named "test1DS" and "test2DS", connecting to two distinct MySQL databases and specifying different connection pool settings:

# definition of data source "test1DS"
jppf.datasource.test1.name = test1DS
jppf.datasource.test1.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.test1.jdbcUrl = jdbc:mysql://192.168.1.12:3306/test1_db
jppf.datasource.test1.username = testjppf
jppf.datasource.test1.password = mypassword
jppf.datasource.test1.minimumIdle = 5
jppf.datasource.test1.maximumPoolSize = 10
jppf.datasource.test1.connectionTimeout = 30000
jppf.datasource.test1.idleTimeout = 600000

# definition of data source "test2DS"
jppf.datasource.test2.name = test2DS
jppf.datasource.test2.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.test2.jdbcUrl = jdbc:mysql://192.168.1.24:3306/test2_db
jppf.datasource.test2.username = testjppf
jppf.datasource.test2.password = mypassword
jppf.datasource.test2.minimumIdle = 1
jppf.datasource.test2.maximumPoolSize = 20
jppf.datasource.test2.connectionTimeout = 30000
jppf.datasource.test2.idleTimeout = 3000000

TIP: to disable a data source definition, simply remove or comment out the "name" property by placing a '#' character at the beginning of the line.
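
To make the naming scheme concrete, the following sketch groups "jppf.datasource.<config_id>.<property>" entries by configuration id and leaves out any definition that has no "name" property, which mirrors the tip above. It is an illustration only, not JPPF's actual implementation: the class and method names are hypothetical, it uses nothing but java.util, and it assumes that a <config_id> contains no '.' character.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of the JPPF API: it only illustrates how the
// "jppf.datasource.<config_id>.<property>" naming scheme partitions properties.
public class DataSourceConfigGrouping {
  private static final String PREFIX = "jppf.datasource.";

  // Returns a map of data source name -> unprefixed properties.
  // Definitions without a "name" property are left out of the result.
  public static Map<String, Map<String, String>> group(Properties config) {
    // first pass: config_id -> (property name -> value)
    Map<String, Map<String, String>> byConfigId = new TreeMap<>();
    for (String key : config.stringPropertyNames()) {
      if (!key.startsWith(PREFIX)) continue;
      String rest = key.substring(PREFIX.length());
      int dot = rest.indexOf('.'); // assumption: no '.' inside the config id
      if (dot <= 0 || dot == rest.length() - 1) continue; // malformed key
      String configId = rest.substring(0, dot);
      String property = rest.substring(dot + 1);
      byConfigId.computeIfAbsent(configId, id -> new TreeMap<>())
        .put(property, config.getProperty(key));
    }
    // second pass: index by data source name, dropping unnamed definitions
    Map<String, Map<String, String>> byName = new LinkedHashMap<>();
    for (Map<String, String> props : byConfigId.values()) {
      String name = props.remove("name");
      if (name != null) byName.put(name, props);
    }
    return byName;
  }

  public static void main(String[] args) {
    Properties config = new Properties();
    config.setProperty("jppf.datasource.test1.name", "test1DS");
    config.setProperty("jppf.datasource.test1.maximumPoolSize", "10");
    // no "name" property for "test2": this definition is skipped
    config.setProperty("jppf.datasource.test2.maximumPoolSize", "20");
    System.out.println(group(config));
  }
}
```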

1.2 Scoped data sources and propagation to the nodes

When a data source is defined in a JPPF driver configuration, it is possible to specify that its definition should be propagated to the nodes connected to this driver, including new nodes that connect well after the driver is started. To achieve this, you can add a property named "scope" to the data source definition, which has three possible values:

  • "local" means the data source definition is intended for the current local JVM only
  • "remote" means the definition is to be sent over to the nodes but not used locally
  • "any" means the definition will be used both locally by the driver and in the nodes

When the scope is unspecified, it always defaults to "local". Here is an example scoped data source definition:

jppf.datasource.node.name = nodeDS
# "nodeDS" data source available in the nodes only
jppf.datasource.node.scope = remote
jppf.datasource.node.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.node.jdbcUrl = jdbc:mysql://192.168.1.12:3306/test1_db
jppf.datasource.node.username = testjppf
jppf.datasource.node.password = mypassword

jppf.datasource.common.name = commonDS
# "commonDS" data source available in the driver and in the nodes
jppf.datasource.common.scope = any
jppf.datasource.common.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.common.jdbcUrl = jdbc:mysql://192.168.1.12:3306/test1_db
jppf.datasource.common.username = testjppf
jppf.datasource.common.password = mypassword

Note: the value of the "scope" property is case-insensitive.
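
The three scope values and their default can be restated as a small enum. This is a hedged sketch rather than a JPPF type: the enum and its method names are hypothetical, and rejecting unknown values with an exception is a choice made for this sketch, not documented JPPF behavior.

```java
import java.util.Locale;

// Hypothetical enum, not a JPPF class: it restates the "scope" semantics
// described in this section.
public enum DataSourceScope {
  LOCAL(true, false),   // current JVM only
  REMOTE(false, true),  // sent to the nodes, not used locally
  ANY(true, true);      // used locally and sent to the nodes

  private final boolean usedLocally;
  private final boolean sentToNodes;

  DataSourceScope(boolean usedLocally, boolean sentToNodes) {
    this.usedLocally = usedLocally;
    this.sentToNodes = sentToNodes;
  }

  public boolean usedLocally() { return usedLocally; }

  public boolean sentToNodes() { return sentToNodes; }

  // Case-insensitive parsing; an unspecified scope defaults to LOCAL.
  // Unknown values raise IllegalArgumentException (a choice of this sketch).
  public static DataSourceScope parse(String value) {
    if (value == null || value.trim().isEmpty()) return LOCAL;
    return valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    for (String s : new String[] { null, "local", "Remote", "ANY" }) {
      DataSourceScope scope = parse(s);
      System.out.println(s + " -> " + scope + ", local=" + scope.usedLocally()
        + ", nodes=" + scope.sentToNodes());
    }
  }
}
```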

1.3 Data source execution policy filter

When the configuration of a driver specifies data sources intended for the nodes, with a scope of either "remote" or "any", it is also possible to specify which nodes the data sources are propagated to, by using an execution policy in XML format.

The execution policy is specified using a configuration property named "policy", in this format:

jppf.datasource.<config_id>.policy = xml_source_type | xml_source

The xml_source_type part of the value specifies where to read the XML policy from, and the meaning of xml_source depends on its value. The value of xml_source_type can be one of:

  • "inline": xml_source is the actual XML policy specified inline in the configuration
  • "file": xml_source represents a path, in either the file system or classpath, to an XML file or resource. The path is looked up first in the file system, then in the classpath if it is not present in the file system
  • "url": xml_source represents a URL to an XML file, including but not limited to http, https, ftp and file URLs.

Note that xml_source_type can be omitted, in which case it defaults to "inline".
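
The format of the "policy" value, including the default source type, can be pictured with the following sketch. It is not how JPPF parses the property: the class is hypothetical, and treating the source type as case-insensitive is an assumption. The '|' character is only treated as a separator when it follows one of the three known source types, so that an inline XML policy containing '|' is left intact.

```java
import java.util.Locale;

// Hypothetical parser, not part of the JPPF API: it splits a "policy" value
// of the form "xml_source_type | xml_source" into its two parts.
public class PolicySpec {
  public final String sourceType; // "inline", "file" or "url"
  public final String source;     // XML text, path or URL, depending on the type

  private PolicySpec(String sourceType, String source) {
    this.sourceType = sourceType;
    this.source = source;
  }

  public static PolicySpec parse(String value) {
    String trimmed = value.trim();
    int sep = trimmed.indexOf('|');
    if (sep >= 0) {
      // assumption: the source type is matched case-insensitively
      String type = trimmed.substring(0, sep).trim().toLowerCase(Locale.ROOT);
      if (type.equals("inline") || type.equals("file") || type.equals("url")) {
        return new PolicySpec(type, trimmed.substring(sep + 1).trim());
      }
    }
    // no recognized source type: the whole value is an inline policy
    return new PolicySpec("inline", trimmed);
  }

  public static void main(String[] args) {
    PolicySpec spec = parse("file | /home/some_user/jppf/my_policy.xml");
    System.out.println(spec.sourceType + " -> " + spec.source);
  }
}
```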

Here is an example specifying an inline policy:

jppf.datasource.node1.name = nodeDS1
jppf.datasource.node1.scope = remote
# ... other properties ...
jppf.datasource.node1.policy = inline | \
  <jppf:ExecutionPolicy> \
    <OR> \
      <Equal valueType="string" ignoreCase="false"> \
        <Property>custom.prop</Property> \
        <Value>node1</Value> \
      </Equal> \
      <IsInIPv4Subnet> \
        <Subnet>192.168.1.10-50</Subnet> \
      </IsInIPv4Subnet> \
    </OR> \
  </jppf:ExecutionPolicy>

TIP: the XML is equivalent to the string produced by invoking the following ExecutionPolicy API:

String xml = new Equal("custom.prop", false, "node1")
  .or(new IsInIPv4Subnet("192.168.1.10-50")).toXML();

The API form is much less cumbersome than hand-written XML, and it can be reused in the configuration by scripting the value of the "policy" property:

# scripting the value with the default JavaScript engine
jppf.datasource.node1.policy = inline | $script{ \
  importPackage(Packages.org.jppf.node.policy); \
  new Equal("custom.prop", false, "node1") \
    .or(new IsInIPv4Subnet("192.168.1.10-50")).toXML(); \
}$


An example using a path to an XML file:

jppf.datasource.node2.name = nodeDS2
jppf.datasource.node2.scope = remote
# ... other properties ...
jppf.datasource.node2.policy = file | /home/some_user/jppf/my_policy.xml

Using the same XML file, specified as a file: URL:

jppf.datasource.node3.name = nodeDS3
jppf.datasource.node3.scope = remote
# ... other properties ...
jppf.datasource.node3.policy = url | file:///home/some_user/jppf/my_policy.xml

1.4 Organizing and maintaining data source definitions

As we have seen in previous sections, it is possible to configure data sources with a lot of flexibility. However, especially when you have to deal with multiple data sources, this may tend to bloat the configuration file, introduce duplication of values for the properties, and cause an additional maintenance burden. We strongly encourage you to use the features described in the JPPF configuration guide: includes, substitutions and scripted values.

For instance, let's imagine a scenario where we define two data sources using distinct databases on the same MySQL server. The two data sources will share the same JDBC driver and server URL, which we can put in separate properties referenced in the data source definitions. Furthermore, to avoid bloating the driver's configuration file, we will put these definitions in a separate file in "config/datasource.properties". Here's what the content of "datasource.properties" would look like:

# variables referenced in the data source definitions
mysql.driver = com.mysql.jdbc.Driver
mysql.url.base = jdbc:mysql://192.168.1.12:3306
my.scope = remote

# data source "DS1"
jppf.datasource.ds1.name = DS1
jppf.datasource.ds1.scope = ${my.scope}
jppf.datasource.ds1.driverClassName = ${mysql.driver}
jppf.datasource.ds1.jdbcUrl = ${mysql.url.base}/test_db1
jppf.datasource.ds1.username = user1
jppf.datasource.ds1.password = password1

# data source "DS2"
jppf.datasource.ds2.name = DS2
jppf.datasource.ds2.scope = ${my.scope}
jppf.datasource.ds2.driverClassName = ${mysql.driver}
jppf.datasource.ds2.jdbcUrl = ${mysql.url.base}/test_db2
jppf.datasource.ds2.username = user2
jppf.datasource.ds2.password = password2

Then, in the main driver configuration file, we would just include "datasource.properties" like this:

# include the data source definitions file
#!include file config/datasource.properties
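
The effect of the ${...} references used in "datasource.properties" can be pictured with the sketch below. It is a simplified, single-pass illustration and not JPPF's substitution mechanism: the class name is hypothetical, nested references are not resolved, and leaving unknown references untouched is a choice made for this sketch.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JPPF API: it replaces each ${name}
// reference in a value with the value of the property called "name".
public class SubstitutionSketch {
  private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

  public static String resolve(String value, Properties props) {
    Matcher matcher = REFERENCE.matcher(value);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      String referenced = props.getProperty(matcher.group(1));
      // unknown references are left as they are (a choice of this sketch)
      String replacement = (referenced == null) ? matcher.group() : referenced;
      matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("mysql.url.base", "jdbc:mysql://192.168.1.12:3306");
    props.setProperty("jppf.datasource.ds1.jdbcUrl", "${mysql.url.base}/test_db1");
    System.out.println(resolve(props.getProperty("jppf.datasource.ds1.jdbcUrl"), props));
  }
}
```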

We have also seen that we can use scripted property values. This can be especially useful if you do not want to leave clear-text passwords in your configuration files. You can then use a script to invoke an API that will decrypt an encrypted password provided in the configuration. For instance, let's say we define the following method in the class test.Crypto:

package test;

public class Crypto {
  // decrypt the specified string
  public static String decrypt(String encrypted) {
    return ...;
  }
}

We can use it directly in a scripted value for a password:

# use an encrypted password in the configuration
jppf.datasource.ds1.password = $script{ test.Crypto.decrypt("Grt!HY+Bp"); }$
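
The body of decrypt() is deliberately left open above. As one possible shape for such a helper, here is a minimal sketch that assumes AES-GCM with the random IV stored in front of the ciphertext and the whole encoded as Base64. Everything in it is an assumption made for illustration: the class name, the explicit key parameter and the hard-coded demo key are hypothetical, and the sample value "Grt!HY+Bp" shown above would not be valid input for it. A real single-argument decrypt(String) would have to obtain the key from a protected location, which is outside the scope of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: AES-GCM, output layout = Base64(IV || ciphertext+tag).
public class CryptoSketch {
  private static final int IV_LENGTH = 12; // bytes
  private static final int TAG_BITS = 128; // authentication tag length

  // Encrypts a clear-text password; the key must be 16, 24 or 32 bytes long.
  public static String encrypt(String clearText, byte[] key) {
    try {
      byte[] iv = new byte[IV_LENGTH];
      new SecureRandom().nextBytes(iv);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
        new GCMParameterSpec(TAG_BITS, iv));
      byte[] cipherText = cipher.doFinal(clearText.getBytes(StandardCharsets.UTF_8));
      // prepend the IV so that decrypt() can recover it
      byte[] out = Arrays.copyOf(iv, IV_LENGTH + cipherText.length);
      System.arraycopy(cipherText, 0, out, IV_LENGTH, cipherText.length);
      return Base64.getEncoder().encodeToString(out);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("encryption failed", e);
    }
  }

  // Reverses encrypt(): decodes Base64, reads the IV, then decrypts the rest.
  public static String decrypt(String encrypted, byte[] key) {
    try {
      byte[] in = Base64.getDecoder().decode(encrypted);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
        new GCMParameterSpec(TAG_BITS, in, 0, IV_LENGTH));
      byte[] clear = cipher.doFinal(in, IV_LENGTH, in.length - IV_LENGTH);
      return new String(clear, StandardCharsets.UTF_8);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("decryption failed", e);
    }
  }

  public static void main(String[] args) {
    // demo key only: never hard-code a real key
    byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
    String encrypted = encrypt("mypassword", key);
    System.out.println(encrypted + " -> " + decrypt(encrypted, key));
  }
}
```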

2 The JPPFDatasourceFactory API

To access data sources defined in the configuration, use the JPPFDatasourceFactory class, defined as follows:

public final class JPPFDatasourceFactory {
  // Get a data source factory. This method always returns the same instance
  public synchronized static JPPFDatasourceFactory getInstance()

  // Get the data source with the specified name
  public DataSource getDataSource(String name)

  // Get the names of all currently defined data sources
  public List<String> getDataSourceNames()

  // Create a data source from the specified configuration properties
  // This method assumes the property names have no prefix
  public DataSource createDataSource(String name, Properties props)

  // Create one or more data sources from the specified configuration properties.
  // The property names are assumed to be prefixed as in static data source configs
  public Map<String, DataSource> createDataSources(Properties props)

  // Remove the data source with the specified name; also close it and release the
  // resources it is using
  public boolean removeDataSource(String name)

  // Close and remove all the data sources in this registry
  public void clear()
}

This class is defined as a JVM-wide singleton, so you must first obtain the singleton instance before using it. For example, to get a defined data source named "DS1", you would write:

DataSource ds1 = JPPFDatasourceFactory.getInstance().getDataSource("DS1");

The methods in JPPFDatasourceFactory are thread-safe and can be used from multiple threads concurrently, without a need for external synchronization.

2.1 Looking up and exploring data sources

Use the getDataSource(String name) method to look up an already defined data source, for instance:

DataSource myDS = JPPFDatasourceFactory.getInstance().getDataSource("myDS");

To explore the existing data sources, use the getDataSourceNames() method, as in this example:

JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
for (String dsName: factory.getDataSourceNames()) {
  DataSource datasource = factory.getDataSource(dsName);
  //  do something with the data source ...
}

Note: getDataSource() and getDataSourceNames() only apply to locally-scoped data source definitions. Data sources defined with a "remote" scope will not be retrieved by these methods.

2.2 Creating data sources dynamically

JPPFDatasourceFactory has two methods that allow you to create one or more data sources:

createDataSource(String name, Properties props)

Creates a single data source with the specified name and the attributes provided in the properties. The names of the properties are not prefixed, meaning that instead of jppf.datasource.<config_id>.<property_name> you just use <property_name>. Here is an example using the TypedProperties class for the properties:

TypedProperties props = new TypedProperties()
  .setString("driverClassName", "com.mysql.jdbc.Driver")
  .setString("jdbcUrl", "jdbc:mysql://localhost:3306/testjppf")
  .setString("username", "testjppf")
  .setString("password", "testjppf")
  .setInt("maximumPoolSize", 10);

JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
DataSource ds1 = factory.createDataSource("DS1", props);

createDataSources(Properties props)

This method enables the creation of one or more data sources at once, using properties with prefixed names. It returns a map associating the names of the created data sources with the corresponding DataSource objects. The name of each data source is specified as a prefixed "name" property in the form jppf.datasource.<config_id>.name = <data_source_name>. Here is an example defining two data sources named "DS1" and "DS2":

TypedProperties props = new TypedProperties();

// property name prefix for "config1" config id
String prefix1 = "jppf.datasource.config1.";
props.setString(prefix1 + "name", "DS1")
  .setString(prefix1 + "driverClassName", "com.mysql.jdbc.Driver")
  // ... other properties ...
  .setInt(prefix1 + "maximumPoolSize", 10);

// property name prefix for "config2" config id
String prefix2 = "jppf.datasource.config2.";
props.setString(prefix2 + "name", "DS2")
  .setString(prefix2 + "driverClassName", "com.mysql.jdbc.Driver")
  // ... other properties ...
  .setInt(prefix2 + "maximumPoolSize", 10);

JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
Map<String, DataSource> map = factory.createDataSources(props);

Note: the createXXX() methods can only be used to create locally-scoped data sources. As a consequence, the "scope" property does not apply to dynamically created data sources and should not be specified. If you specify a "remote" scope, then the data source definition will be ignored.
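
The two methods differ only in how the property names are written. The sketch below shows the relationship between the two formats by turning a set of unprefixed properties into the prefixed form. It relies on java.util.Properties only, and the class and method names are hypothetical, not part of the JPPF API.

```java
import java.util.Properties;

// Hypothetical helper, not part of the JPPF API: it converts unprefixed
// data source properties into the "jppf.datasource.<config_id>." form.
public class PrefixedProperties {
  // Adds one data source definition to 'target' and returns 'target'.
  public static Properties addDefinition(Properties target, String configId,
      String dataSourceName, Properties unprefixed) {
    String prefix = "jppf.datasource." + configId + ".";
    target.setProperty(prefix + "name", dataSourceName);
    for (String key : unprefixed.stringPropertyNames()) {
      target.setProperty(prefix + key, unprefixed.getProperty(key));
    }
    return target;
  }

  public static void main(String[] args) {
    Properties common = new Properties();
    common.setProperty("driverClassName", "com.mysql.jdbc.Driver");
    common.setProperty("maximumPoolSize", "10");

    Properties all = new Properties();
    addDefinition(all, "config1", "DS1", common);
    addDefinition(all, "config2", "DS2", common);
    all.list(System.out);
  }
}
```

The resulting Properties object has the shape expected by createDataSources(Properties), whereas the unprefixed "common" object has the shape expected by createDataSource(String, Properties).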

2.3 Data source removal and cleanup

The removeDataSource(String name) method removes the data source with the specified name and releases any resource it is using. It returns a boolean value which indicates whether the removal operation succeeded. Example:

TypedProperties props = ...;
JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
DataSource ds1 = factory.createDataSource("myDS", props);
// ... do something with the data source, then remove it ...
if (!factory.removeDataSource("myDS")) {
  System.out.println("could not remove data source 'myDS', please look at the logs");
}

Similarly, the clear() method removes all data sources defined in the JPPFDatasourceFactory instance.

3 Class loading and classpath considerations

3.1 HikariCP and JDBC driver libraries

As a general rule, the HikariCP and JDBC driver libraries should always be co-located in the classpath. This means that, if you have one in a JPPF component's classpath, then the other should be in the same component's classpath. For instance, the JPPF driver distribution includes the HikariCP jar file in its '/lib' folder. This is also where the JDBC driver library should be. As per the JPPF distributed class loading mechanism, this will automatically make these libraries available to any standard JPPF node that connects to the driver, and the classes in these libraries will be automatically downloaded when needed by the node.

Following this rule avoids many potential class loading issues and ClassCastException errors due to classes being loaded by multiple class loaders.

3.2 Default classpath

By default, the JPPF driver comes with the HikariCP library in its '/lib' folder. This is also where the JDBC drivers for all the databases you use should go. If you use only standard nodes with remote class loading enabled (the default), then it is sufficient to make these libraries available to the nodes.

If you do not wish to deploy the JDBC driver library in the 'lib' folder, the preferred way to specify its location is to add one or more -cp <driver_path> arguments to the driver's jppf.jvm.options configuration property, for example:

jppf.jvm.options = -server -Xmx1g \
  -cp /opt/MySQL/mysql-connector-java-5.1.10/mysql-connector-java-5.1.10-bin.jar

3.3 Offline nodes

In an offline node, remote class loading is completely disabled. Consequently, the HikariCP and JDBC driver libraries must both be explicitly deployed to the node's local classpath. This can be done using the same techniques as described in the above section.

The HikariCP library can be found either:

  • in the driver distribution: JPPF-x.y.z-driver/lib/HikariCP-java7-2.4.11.jar
  • or in the source distribution: JPPF-x.y.z-full-src/JPPF/lib/HikariCP/HikariCP-java7-2.4.11.jar
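
When the libraries have to be deployed by hand, a quick check run inside the JVM in question can confirm that both are visible before any data source is created. The following is a minimal sketch: the class and method names are hypothetical, com.zaxxer.hikari.HikariDataSource is the HikariCP data source class, and com.mysql.jdbc.Driver is the driver class used in the examples of this page.

```java
// Hypothetical diagnostic, not part of the JPPF API: it reports whether
// a class can be found on the classpath of the current JVM.
public class ClasspathCheck {
  public static boolean isAvailable(String className) {
    try {
      // 'false': only locate the class, do not run its static initializers
      Class.forName(className, false, ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String[] required = {
      "com.zaxxer.hikari.HikariDataSource", // HikariCP
      "com.mysql.jdbc.Driver"               // JDBC driver from the examples
    };
    for (String className : required) {
      System.out.println(className + ": "
        + (isAvailable(className) ? "found" : "MISSING from the classpath"));
    }
  }
}
```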

3.4 JPPF client applications

A JPPF client will automatically create data sources defined in its configuration. Data sources can also be explicitly added, using the JPPFDatasourceFactory.createDataSources() method, as illustrated in the "Creating data sources dynamically" section.

When multiple JPPFClient instances coexist in the same JVM, care must be taken with the naming of the data sources. If a data source name is found multiple times, only the first definition will be taken into account. Subsequent definitions will be ignored.

As for offline nodes, the HikariCP and JDBC driver libraries must be explicitly added to the client JVM's classpath.
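
The "first definition wins" rule can be pictured with a small name-keyed registry. This is only a model of the behavior described above, written with java.util types. It is not the JPPFDatasourceFactory implementation, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical model, not JPPF code: a registry in which a name can only
// be defined once, and later definitions with the same name are ignored.
public class FirstDefinitionWins {
  private final Map<String, Properties> definitions = new LinkedHashMap<>();

  // Returns true if the definition was registered, false if it was ignored
  // because the name was already taken.
  public synchronized boolean define(String name, Properties props) {
    return definitions.putIfAbsent(name, props) == null;
  }

  public synchronized Properties get(String name) {
    return definitions.get(name);
  }

  public static void main(String[] args) {
    FirstDefinitionWins registry = new FirstDefinitionWins();

    Properties first = new Properties();
    first.setProperty("jdbcUrl", "jdbc:mysql://192.168.1.12:3306/test1_db");
    Properties second = new Properties();
    second.setProperty("jdbcUrl", "jdbc:mysql://192.168.1.24:3306/test2_db");

    System.out.println("first registered: " + registry.define("DS1", first));
    System.out.println("second registered: " + registry.define("DS1", second));
    System.out.println("effective URL: " + registry.get("DS1").getProperty("jdbcUrl"));
  }
}
```

Giving each client's data sources distinct names, for instance with a per-client prefix, avoids relying on the order in which the definitions are encountered.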




JPPF Copyright © 2005-2020 JPPF.org Powered by MediaWiki