Database services
From JPPF 6.3 Documentation
A great many JPPF-based applications have a need to access one or more databases in the computations they submit to the grid. While it is quite easy to define and use database connectivity from within JPPF tasks, we have found that it can also be a source of headaches when addressing issues such as:
- caching and maintaining data sources that are reusable across job submissions
- handling the availability of the JDBC driver and data source implementation classes
- avoiding the configuration and deployment burden which occurs when database access is needed on a large number of nodes in the grid
To address these issues, JPPF provides a database services component which includes:
- a facility to easily define and configure JDBC data sources from the JPPF configuration (static definition)
- a data source registry and factory to look up and manipulate data sources, as well as create new ones dynamically
- a set of functionalities to define data sources in a JPPF driver and propagate them automatically to the nodes
The data source and connection pooling implementations are provided by the HikariCP library. Consequently, the names of the corresponding configuration properties are those provided in the HikariCP documentation.
1 Configuring JDBC data sources
1.1 Data source definition
A data source can be defined in the JPPF configuration with a set of properties in the following format:
jppf.datasource.<config_id>.name = <datasource_name>
jppf.datasource.<config_id>.<property1> = <value1>
...
jppf.datasource.<config_id>.<propertyN> = <valueN>
where:
- <config_id> is an arbitrary string used to identify which properties define the data source and to distinguish these properties from those of other data source definitions
- <datasource_name> is a user-defined name that will be used to retrieve the data source via API
- <propertyi> is the name of a HikariCP-supported property
Here is a concrete example, defining two data sources named "test1DS" and "test2DS", connecting to two distinct MySQL databases and specifying different connection pool settings:
# definition of data source "test1DS"
jppf.datasource.test1.name = test1DS
jppf.datasource.test1.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.test1.jdbcUrl = jdbc:mysql://192.168.1.12:3306/test1_db
jppf.datasource.test1.username = testjppf
jppf.datasource.test1.password = mypassword
jppf.datasource.test1.minimumIdle = 5
jppf.datasource.test1.maximumPoolSize = 10
jppf.datasource.test1.connectionTimeout = 30000
jppf.datasource.test1.idleTimeout = 600000

# definition of data source "test2DS"
jppf.datasource.test2.name = test2DS
jppf.datasource.test2.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.test2.jdbcUrl = jdbc:mysql://192.168.1.24:3306/test2_db
jppf.datasource.test2.username = testjppf
jppf.datasource.test2.password = mypassword
jppf.datasource.test2.minimumIdle = 1
jppf.datasource.test2.maximumPoolSize = 20
jppf.datasource.test2.connectionTimeout = 30000
jppf.datasource.test2.idleTimeout = 3000000
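To make the role of <config_id> concrete, the following is a minimal sketch of how prefixed properties could be grouped into one set of attributes per data source definition. It is not JPPF's actual parsing code: the class name DataSourceConfigGrouper is hypothetical, and the sketch assumes that <config_id> contains no dot.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, for illustration only: not part of the JPPF API.
class DataSourceConfigGrouper {
  static final String PREFIX = "jppf.datasource.";

  // Groups "jppf.datasource.<config_id>.<property>" entries by <config_id>.
  // Assumption: <config_id> contains no dot; <property> may contain dots.
  static Map<String, Map<String, String>> group(Properties config) {
    Map<String, Map<String, String>> result = new TreeMap<>();
    for (String key : config.stringPropertyNames()) {
      if (!key.startsWith(PREFIX)) continue; // not a data source property
      String rest = key.substring(PREFIX.length());
      int dot = rest.indexOf('.');
      if (dot <= 0 || dot == rest.length() - 1) continue; // malformed key, skipped
      String configId = rest.substring(0, dot);
      String property = rest.substring(dot + 1);
      result.computeIfAbsent(configId, id -> new TreeMap<>())
            .put(property, config.getProperty(key));
    }
    return result;
  }
}
```

Applied to the example above, this would yield two groups keyed "test1" and "test2", each holding its own "name", "jdbcUrl" and pool settings.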
1.2 Scoped data sources and propagation to the nodes
When a data source is defined in a JPPF driver configuration, it is possible to specify that its definition should be propagated to the nodes connected to this driver, including new nodes that connect well after the driver is started. To achieve this, you can add a property named "scope" to the data source definition, which has three possible values:
- "local" means the data source definition is intended for the current local JVM only
- "remote" means the definition is to be sent over to the nodes but not used locally
- "any" means the definition will be used both locally by the driver and in the nodes
When the scope is unspecified, it always defaults to "local". Here is an example scoped data source definition:
jppf.datasource.node.name = nodeDS
# "nodeDS" data source available in the nodes only
jppf.datasource.node.scope = remote
jppf.datasource.node.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.node.jdbcUrl = jdbc:mysql://192.168.1.12:3306/test1_db
jppf.datasource.node.username = testjppf
jppf.datasource.node.password = mypassword

jppf.datasource.common.name = commonDS
# "commonDS" data source available in the driver and in the nodes
jppf.datasource.common.scope = any
jppf.datasource.common.driverClassName = com.mysql.jdbc.Driver
jppf.datasource.common.jdbcUrl = jdbc:mysql://192.168.1.12:3306/test1_db
jppf.datasource.common.username = testjppf
jppf.datasource.common.password = mypassword
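The three scope values can be summarized with a small model. This is only a sketch of the semantics described above, not JPPF code: the enum DataSourceScope and its methods are hypothetical, and case-insensitive parsing is an assumption.

```java
import java.util.Locale;

// Hypothetical model of the "scope" property, for illustration only.
enum DataSourceScope {
  LOCAL, REMOTE, ANY;

  // An unspecified scope defaults to "local", as stated above.
  // Assumption: values are matched case-insensitively; unknown values are rejected.
  static DataSourceScope parse(String value) {
    if (value == null || value.trim().isEmpty()) return LOCAL;
    return valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  // True when the data source is created in the JVM that holds the configuration.
  boolean usedLocally() { return this == LOCAL || this == ANY; }

  // True when the definition is sent to the nodes connected to the driver.
  boolean propagatedToNodes() { return this == REMOTE || this == ANY; }
}
```

With this model, "nodeDS" (scope "remote") is propagated but not used locally, while "commonDS" (scope "any") is both.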
1.3 Data source execution policy filter
When the configuration of a driver specifies data sources intended for the nodes, with a scope of either "remote" or "any", it is also possible to specify which nodes the data sources are propagated to, by using an execution policy in XML format.
The execution policy is specified using a configuration property named "policy", in this format:
jppf.datasource.<config_id>.policy = xml_source_type | xml_source
The xml_source_type part of the value specifies where to read the XML policy from, and the meaning of xml_source depends on its value. The value of xml_source_type can be one of:
- "inline": xml_source is the actual XML policy specified inline in the configuration
- "file": xml_source represents a path, in either the file system or classpath, to an XML file or resource. The path is looked up first in the file system, then in the classpath if it is not present in the file system
- "url": xml_source represents a URL to an XML file, including but not limited to, http, https, ftp and file urls.
Note that xml_source_type can be omitted, in which case it defaults to "inline".
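The format of the "policy" value, including the "inline" default, can be sketched as follows. This is an illustration under stated assumptions rather than JPPF's implementation: the class PolicySpec is hypothetical, and the sketch assumes the type is recognized only when one of the three documented keywords appears before the first "|".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical parser for the "policy" property value; not JPPF's implementation.
class PolicySpec {
  private static final List<String> TYPES = Arrays.asList("inline", "file", "url");

  final String sourceType;
  final String source;

  PolicySpec(String sourceType, String source) {
    this.sourceType = sourceType;
    this.source = source;
  }

  // Splits "xml_source_type | xml_source"; falls back to "inline" when no type is given.
  static PolicySpec parse(String value) {
    String trimmed = value.trim();
    int sep = trimmed.indexOf('|');
    if (sep >= 0) {
      String type = trimmed.substring(0, sep).trim().toLowerCase(Locale.ROOT);
      if (TYPES.contains(type)) {
        return new PolicySpec(type, trimmed.substring(sep + 1).trim());
      }
    }
    // No recognized type before a '|': the whole value is treated as inline XML.
    return new PolicySpec("inline", trimmed);
  }
}
```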
Here is an example specifying an inline policy:
jppf.datasource.node1.name = nodeDS1
jppf.datasource.node1.scope = remote
# ... other properties ...
jppf.datasource.node1.policy = inline | \
  <jppf:ExecutionPolicy> \
    <OR> \
      <Equal valueType="string" ignoreCase="false"> \
        <Property>custom.prop</Property> \
        <Value>node1</Value> \
      </Equal> \
      <IsInIPv4Subnet> \
        <Subnet>192.168.1.10-50</Subnet> \
      </IsInIPv4Subnet> \
    </OR> \
  </jppf:ExecutionPolicy>
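Writing XML inline in a properties file is cumbersome. The same policy can be expressed with the execution policy Java API and converted to XML with toXML():
String xml = new Equal("custom.prop", false, "node1").or(new IsInIPv4Subnet("192.168.1.10-50")).toXML();
This is much less cumbersome, and the same expression can be used to script the value of the "policy" property: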
# scripting the value with the default Javascript engine
jppf.datasource.node1.policy = inline | $script{ \
  new org.jppf.node.policy.Equal("custom.prop", false, "node1") \
    .or(new org.jppf.node.policy.IsInIPv4Subnet("192.168.1.10-50")).toXML(); \
}$
An example using a path to an XML file:
jppf.datasource.node2.name = nodeDS2
jppf.datasource.node2.scope = remote
# ... other properties ...
jppf.datasource.node2.policy = file | /home/some_user/jppf/my_policy.xml
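The lookup order for the "file" source type (file system first, then classpath) can be sketched like this. The class PolicyFileLocator is hypothetical and only illustrates the order described above; it is not how JPPF itself is implemented.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper illustrating the lookup order; not JPPF's implementation.
class PolicyFileLocator {
  // Opens the policy at the given path: file system first, then classpath.
  static InputStream open(String path) throws IOException {
    Path file = Paths.get(path);
    if (Files.isRegularFile(file)) {
      return Files.newInputStream(file); // 1. found in the file system
    }
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    if (cl == null) cl = PolicyFileLocator.class.getClassLoader();
    InputStream is = cl.getResourceAsStream(path); // 2. classpath resource
    if (is == null) {
      throw new FileNotFoundException("policy not found: " + path);
    }
    return is;
  }
}
```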
Using the same XML file specified as a file: url:
jppf.datasource.node3.name = nodeDS3
jppf.datasource.node3.scope = remote
# ... other properties ...
jppf.datasource.node3.policy = url | file:///home/some_user/jppf/my_policy.xml
1.4 Organizing and maintaining data source definitions
As we have seen in previous sections, it is possible to configure data sources with a lot of flexibility. However, especially when you have to deal with multiple data sources, this may tend to bloat the configuration file, introduce duplication of values for the properties, and cause an additional maintenance burden. We strongly encourage you to use the features described in the JPPF configuration guide: includes, substitutions and scripted values.
For instance, let's imagine a scenario where we define two data sources using distinct databases on the same MySQL server. The two data sources will share the same JDBC driver and server URL, which we can put in separate properties referenced in the data source definitions. Furthermore, to avoid bloating the driver's configuration file, we will put these definitions in a separate file in "config/datasource.properties". Here's what the content of "datasource.properties" would look like:
# variables referenced in the data source definitions
mysql.driver = com.mysql.jdbc.Driver
mysql.url.base = jdbc:mysql://192.168.1.12:3306
my.scope = remote

# data source "DS1"
jppf.datasource.ds1.name = DS1
jppf.datasource.ds1.scope = ${my.scope}
jppf.datasource.ds1.driverClassName = ${mysql.driver}
jppf.datasource.ds1.jdbcUrl = ${mysql.url.base}/test_db1
jppf.datasource.ds1.username = user1
jppf.datasource.ds1.password = password1

# data source "DS2"
jppf.datasource.ds2.name = DS2
jppf.datasource.ds2.scope = ${my.scope}
jppf.datasource.ds2.driverClassName = ${mysql.driver}
jppf.datasource.ds2.jdbcUrl = ${mysql.url.base}/test_db2
jppf.datasource.ds2.username = user2
jppf.datasource.ds2.password = password2
Then, in the main driver configuration file, we would just include "datasource.properties" like this:
# include the data source definitions file
#!include file config/datasource.properties
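To show what the ${...} references in "datasource.properties" evaluate to, here is a minimal single-pass substitution sketch. It only mimics the idea and is not the JPPF configuration engine: the class SubstitutionSketch is hypothetical, and leaving unknown references untouched is an assumption.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical single-pass resolver for ${name} references; not JPPF's implementation.
class SubstitutionSketch {
  private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

  // Replaces each ${name} with the value of property "name".
  // Assumption: references to undefined properties are left as they are.
  static String resolve(String value, Properties props) {
    Matcher matcher = REFERENCE.matcher(value);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      String replacement = props.getProperty(matcher.group(1));
      if (replacement == null) replacement = matcher.group();
      matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(result);
    return result.toString();
  }
}
```

Under these assumptions, the jdbcUrl of "DS1" resolves to jdbc:mysql://192.168.1.12:3306/test_db1, so changing the server address only requires editing mysql.url.base.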
We have also seen that we can use scripted property values. This can be especially useful if you do not want to leave clear-text passwords in your configuration files. You can then use a script to invoke an API that will decrypt an encrypted password provided in the configuration. For instance, let's say we define the following method in the class test.Crypto:
package test;

public class Crypto {
  // decrypt the specified string
  public static String decrypt(String encrypted) {
    return ...;
  }
}
We can use it directly in a scripted value for a password:
# use an encrypted password in the configuration
jppf.datasource.ds1.password = $script{ test.Crypto.decrypt("Grt!HY+Bp"); }$
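The body of decrypt() is left open above. One possible way to fill it in, shown purely as a sketch, is AES-GCM from the standard javax.crypto package with a Base64-encoded value of the form IV followed by ciphertext. The class name CryptoSketch, the explicit key parameter and the encoding layout are all assumptions made for this illustration; a real one-argument decrypt(String) would have to obtain the key from a location of your choosing.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a password encryption helper; key handling is an assumption.
class CryptoSketch {
  private static final int IV_LENGTH = 12;   // bytes, the usual IV size for GCM
  private static final int TAG_LENGTH = 128; // bits

  // Encrypts with AES-GCM and returns Base64(iv + ciphertext). Key: 16, 24 or 32 bytes.
  static String encrypt(String clear, byte[] key) {
    try {
      byte[] iv = new byte[IV_LENGTH];
      new SecureRandom().nextBytes(iv);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(TAG_LENGTH, iv));
      byte[] encrypted = cipher.doFinal(clear.getBytes(StandardCharsets.UTF_8));
      byte[] out = new byte[iv.length + encrypted.length];
      System.arraycopy(iv, 0, out, 0, iv.length);
      System.arraycopy(encrypted, 0, out, iv.length, encrypted.length);
      return Base64.getEncoder().encodeToString(out);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("encryption failed", e);
    }
  }

  // Reverses encrypt(): reads the IV from the first 12 bytes, then decrypts the rest.
  static String decrypt(String encoded, byte[] key) {
    try {
      byte[] in = Base64.getDecoder().decode(encoded);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(TAG_LENGTH, in, 0, IV_LENGTH));
      byte[] clear = cipher.doFinal(in, IV_LENGTH, in.length - IV_LENGTH);
      return new String(clear, StandardCharsets.UTF_8);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("decryption failed", e);
    }
  }
}
```

Note that this only moves the secret from the password to the key: the approach is useful when the key can be stored somewhere better protected than the configuration file.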
2 The JPPFDatasourceFactory API
To access data sources defined in the configuration, use the JPPFDatasourceFactory class, defined as follows:
public final class JPPFDatasourceFactory {
  // Get a data source factory. This method always returns the same instance
  public synchronized static JPPFDatasourceFactory getInstance()

  // Get the data source with the specified name
  public DataSource getDataSource(String name)

  // Get the names of all currently defined data sources
  public List<String> getDataSourceNames()

  // Create a data source from the specified configuration properties
  // This method assumes the property names have no prefix
  public DataSource createDataSource(String name, Properties props)

  // Create one or more data sources from the specified configuration properties.
  // The property names are assumed to be prefixed as in static data source configs
  public Map<String, DataSource> createDataSources(Properties props)

  // Remove the data source with the specified name; also close it and release the
  // resources it is using
  public boolean removeDataSource(String name)

  // Close and remove all the data sources in this registry
  public void clear()
}
This class is defined as a JVM-wide singleton, so you must first obtain the singleton instance before using it. For example, to get a defined data source named "DS1":
DataSource ds1 = JPPFDatasourceFactory.getInstance().getDataSource("DS1");
The methods in JPPFDatasourceFactory are thread-safe and can be used from multiple threads concurrently, without a need for external synchronization.
2.1 Looking up and exploring data sources
Use the getDataSource(String name) method to look up an already defined data source, for instance:
DataSource myDS = JPPFDatasourceFactory.getInstance().getDataSource("myDS");
To explore the existing data sources, use the getDataSourceNames() method, as in this example:
JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
for (String dsName: factory.getDataSourceNames()) {
  DataSource datasource = factory.getDataSource(dsName);
  // do something with the data source ...
}
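Once a data source has been looked up, it can be used with plain JDBC. The sketch below assumes that the returned object is a standard javax.sql.DataSource and uses only JDK classes; the class DataSourceUsageSketch and its method are hypothetical names introduced for illustration. The try-with-resources statement returns the connection to the pool even when the query fails.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// Hypothetical helper showing typical JDBC usage of a looked-up data source.
class DataSourceUsageSketch {
  // Runs the query and returns the first column of every row as a string.
  static List<String> firstColumn(DataSource ds, String sql) throws SQLException {
    List<String> values = new ArrayList<>();
    // Resources are closed in reverse order; closing the connection releases it
    // back to the pool rather than closing the physical connection.
    try (Connection connection = ds.getConnection();
         Statement statement = connection.createStatement();
         ResultSet rs = statement.executeQuery(sql)) {
      while (rs.next()) {
        values.add(rs.getString(1));
      }
    }
    return values;
  }
}
```

In a JPPF task, the ds argument would typically be the result of JPPFDatasourceFactory.getInstance().getDataSource(...), as shown earlier in this section.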
2.2 Creating data sources dynamically
JPPFDatasourceFactory has two methods that allow you to create one or more data sources:
createDataSource(String name, Properties props)
Creates a single data source with the specified name and the attributes provided in the properties. The names of the properties are not prefixed, meaning that instead of jppf.datasource.<config_id>.<property_name> you just use <property_name>. Here is an example using the TypedProperties class for the properties:
TypedProperties props = new TypedProperties()
  .setString("driverClassName", "com.mysql.jdbc.Driver")
  .setString("jdbcUrl", "jdbc:mysql://localhost:3306/testjppf")
  .setString("username", "testjppf")
  .setString("password", "testjppf")
  .setInt("maximumPoolSize", 10);
JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
DataSource ds1 = factory.createDataSource("DS1", props);
createDataSources(Properties props)
This method creates one or more data sources at once, using properties with prefixed names. It returns a map associating the names of the created data sources with the corresponding DataSource objects. The name of each data source is specified as a prefixed "name" property in the form jppf.datasource.<config_id>.name = <data_source_name>. Here is an example defining two data sources named "DS1" and "DS2":
TypedProperties props = new TypedProperties();
// property name prefix for "config1" config id
String prefix1 = "jppf.datasource.config1.";
props.setString(prefix1 + "name", "DS1")
  .setString(prefix1 + "driverClassName", "com.mysql.jdbc.Driver")
  // ... other properties ...
  .setInt(prefix1 + "maximumPoolSize", 10);
// property name prefix for "config2" config id
String prefix2 = "jppf.datasource.config2.";
props.setString(prefix2 + "name", "DS2")
  .setString(prefix2 + "driverClassName", "com.mysql.jdbc.Driver")
  // ... other properties ...
  .setInt(prefix2 + "maximumPoolSize", 10);
JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
Map<String, DataSource> map = factory.createDataSources(props);
2.3 Data source removal and cleanup
The removeDataSource(String name) method removes the data source with the specified name and releases any resources it is using. It returns a boolean value which indicates whether the removal operation succeeded. Example:
TypedProperties props = ...;
JPPFDatasourceFactory factory = JPPFDatasourceFactory.getInstance();
DataSource ds1 = factory.createDataSource("myDS", props);
// ... do something with the data source, then remove it ...
if (!factory.removeDataSource("myDS")) {
  System.out.println("could not remove data source 'myDS', please look at the logs");
}
Similarly, the clear() method removes all data sources defined in the JPPFDatasourceFactory instance.
3 Class loading and classpath considerations
3.1 HikariCP and JDBC driver libraries
As a general rule, the HikariCP and JDBC driver libraries should always be co-located in the classpath. This means that, if you have one in a JPPF component's classpath, then the other should be in the same component's classpath. For instance, the JPPF driver distribution includes the HikariCP jar file in its '/lib' folder. This is also where the JDBC driver library should be. As per the JPPF distributed class loading mechanism, this will automatically make these libraries available to any standard JPPF node that connects to the driver, and the classes in these libraries will be automatically downloaded when needed by the node.
Following this rule avoids many potential class loading issues and ClassCastException errors due to classes being loaded by multiple class loaders.
3.2 Default classpath
By default, the JPPF driver comes with the HikariCP library in its '/lib' folder. This is also where the JDBC drivers for all the databases you use should go. If you use only standard nodes with remote class loading enabled (the default), this is sufficient to make these libraries available to the nodes.
If you do not wish to deploy the JDBC driver library in the '/lib' folder, the preferred way to specify its location is to add one or more -cp <driver_path> arguments to the driver's jppf.jvm.options configuration property, for example:
jppf.jvm.options = -server -Xmx1g \
  -cp /opt/MySQL/mysql-connector-java-5.1.10/mysql-connector-java-5.1.10-bin.jar
3.3 Offline nodes
In an offline node, remote class loading is completely disabled. Consequently, the HikariCP and JDBC driver libraries must both be explicitly deployed to the node's local classpath. This can be done using the same techniques as described in the previous section.
The HikariCP library can be found either:
- in the driver distribution: JPPF-x.y.z-driver/lib/HikariCP-java7-2.4.11.jar
- or in the source distribution: JPPF-x.y.z-full-src/JPPF/lib/HikariCP/HikariCP-java7-2.4.11.jar
3.4 JPPF client applications
A JPPF client will automatically create the data sources defined in its configuration. Data sources can also be explicitly added using the JPPFDatasourceFactory.createDataSources() method, as illustrated in the section "Creating data sources dynamically".
When multiple JPPFClient instances coexist in the same JVM, care must be taken with the naming of the data sources. If a data source name is found multiple times, only the first definition will be taken into account. Subsequent definitions will be ignored.
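The "first definition wins" behavior can be pictured with a small registry built on ConcurrentHashMap.putIfAbsent(). This is a hypothetical model of the rule stated above, not the JPPFDatasourceFactory implementation; the class FirstWinsRegistry exists only for this illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry illustrating "first definition wins"; not JPPF's implementation.
class FirstWinsRegistry<V> {
  private final Map<String, V> entries = new ConcurrentHashMap<>();

  // Registers the value unless the name is already taken.
  // Returns the value that is actually kept under that name.
  V register(String name, V value) {
    V previous = entries.putIfAbsent(name, value);
    return previous != null ? previous : value;
  }

  // Returns the value registered under the name, or null if there is none.
  V get(String name) {
    return entries.get(name);
  }
}
```

A practical consequence is that two clients in the same JVM which need different databases should use distinct data source names, for example by including a client-specific prefix in each name.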
As with offline nodes, the HikariCP and JDBC driver libraries must be explicitly added to the client JVM's classpath.