
Node.js API

The NDB API for use in Node.js applications is called Database Jones. It is a direct NDB API that can be used to develop real-time Node.js applications on top of NDB.

Node.js is a server-side implementation of the JavaScript language. JavaScript was originally developed to handle interactive web pages. Since so many people know the language, it is now also used in web server development.

Database Jones is aimed at web server applications that need a scalable RonDB solution.

With Database Jones it is possible to have hundreds of web servers connected to the same cluster. Using Global Replication, this can be scaled even further by adding many read clusters.

Given that NDB has very good write scalability, Database Jones is particularly interesting for applications that put a heavy write load on the database.

With this in mind a sample application was developed that mimics a simple tweet application. It is a very simple web server that accepts simple HTTP requests and shows how to program such an application against NDB using Database Jones, the NDB adapter for Node.js.

Database Jones is developed in a separate git tree named mysql/mysql-js.

This tree contains the NDB adapter called ndb. It also contains an extension of the mysql adapter. The description of the API applies both to the mysql adapter and to the ndb adapter. The main interest in this book is the ndb adapter, which connects directly to NDB using the C++ NDB API. Node.js can map C++ calls to Node.js calls in a layer similar to the JNI layer for Java.

In this chapter we will go through how to install Database Jones, how to get the sample application up and running, and some very basic use cases for the API.

Programming in Node.js is asynchronous. This works very well with NDB, since NDB is designed as an asynchronous engine under the hood. At the same time it is a bit more complex to reason about asynchronous programs; it can take some time to get used to for someone accustomed to sequential programming.

Installing Database Jones

We will describe here how to install the Node.js NDB adapter from github.com, where the development tree resides.

To install from github.com we first need to install git. We also need an installation of RonDB; what we need from it is the NDB API library used by the ndb adapter.

The next step is to install Node.js. On my Mac OS X I simply wrote brew install node to do this. It should be easy enough to find out how to do this on your machine and operating system. Database Jones works primarily on Linux and Mac OS X.

Now we have the npm program that we can use to install the mysql adapter if we want that one installed. It is not strictly necessary, since all API operations can be performed using the ndb adapter. The ndb adapter cannot, however, be used to create, drop and alter tables.

To install the mysql adapter use the following command:

npm install mysql

Preparing to install the ndb adapter

To install the ndb adapter we need to compile the C++ code that maps Node.js calls to C++ NDB API calls. This is done using a Node.js module called node-gyp. To install this use the command:

npm install -g node-gyp

node-gyp requires Python 2.7. On Linux we need a make tool and a compiler such as gcc.

On Mac OS X we need to install the XCode environment, which contains both a compiler and make tools.

Installing the ndb adapter

We start by preparing git to use symlinks. Next we download mysql/mysql-js from github.com and prepare to install the ndb adapter using the following commands:

git config --global --add core.symlinks true
git clone http://github.com/mysql/mysql-js
cd mysql-js/jones-ndb

First we need to configure the ndb adapter using the command:

node configure

During the execution of this command we need to point to the RonDB installation so that the NDB API library can be found.

Now we are ready to build the ndb adapter using the command:

node-gyp configure build -d

It is normal to get a warning that $MYSQL_INSTALL_DIR/lib/mysql doesn't exist. In newer versions of MySQL this directory no longer exists.

The Tweet sample application

With Database Jones installed we can now make some experiments in how to use it. Since JavaScript was developed for web pages and Node.js brings it to web servers, a natural example application is a tweet application.

The sample application is available in the samples/tweet directory.

For simplicity we will start up a cluster using the MySQL test tool MTR. Go to the mysql-test directory in your RonDB installation and run the following commands:

cd $MYSQL_INSTALL_DIR/mysql-test
./mtr --start ndb.ndb_basic

This starts up a MySQL Server on port 13001 and an NDB management server on port 13000 together with two data nodes.

First we need to create the tables. We run the following command in the samples/tweet directory.

mysql --protocol=tcp --port=13001 -uroot < create_tweet_tables.sql

The tweet.js program contains a web server that we will start on port 7800.

Before we do this we need to set up the environment to use the MTR deployment. To do this we perform the following commands:

export JONES_ADAPTER="ndb"
export JONES_DEPLOYMENT="mtr"
export LD_LIBRARY_PATH="$MYSQL_INSTALL_DIR/lib"
node tweet start server 7800

Now we have a web server running on port 7800. If we run on Mac OS X we need to change the LD_LIBRARY_PATH to DYLD_LIBRARY_PATH.

In the last line, node is the Node.js execution environment, tweet means that the file tweet.js is executed, and start server 7800 are the three startup parameters to the tweet program.

export needs to be replaced by setenv on some shell implementations.

This is the general method of executing a Node.js program:

node my_program_name program_parameters

This will execute the program in my_program_name.js. Since Node.js is an interpreted language, there is no need to compile the program before executing it. The compilation happens on the fly.

The next step is to populate the database with the initial data. This is performed using the script demo_populate.sh. This script is for some reason hardcoded to use the mysql adapter. If you haven't installed the mysql adapter you need to manually change it to use the ndb adapter on one of the first lines in this script (the export command).

Now we run demo_http_get.sh, which issues some HTTP requests such as this one:

curl http://localhost:7800/tweets-about/carthage

The variable JONES_DEPLOYMENT decides which deployment is chosen from the file jones_deployment.js in the top directory for this test run.

ConnectionProperties

The first part of the program always calls require with database-jones as its parameter. The next step is to create an object of type ConnectionProperties. This object decides how to connect to RonDB.

mysql_host, mysql_user, mysql_port and mysql_password contain the parameters needed to connect to the MySQL Server using the mysql adapter.

None of these are needed for the ndb adapter, but they could be used by the ndb adapter to create tables through a MySQL Server.

It is also possible to set mysql_charset, mysql_socket, mysql_sql_mode, mysql_pool_size and a few debug parameters on the mysql adapter. These cannot be set on the ndb adapter.

Common ConnectionProperties

There is only one ConnectionProperties setting common to the mysql and ndb adapters: the database property. It defaults to test for both adapters.

ndb ConnectionProperties

ndb_connect_retries

Defaults to 4. This is the number of retries we will perform to connect to the NDB management server before giving up. A negative value means that there is no limit on the number of retries; connecting will be retried indefinitely.

ndb_connect_delay

This represents the delay between attempts to connect to the cluster. By default it is set to 1 second.

ndb_connect_verbose

If set, progress on the cluster connection setup is printed to the console. By default it is set.

ndb_connect_retries, ndb_connect_delay and ndb_connect_verbose are all passed to the call where the cluster connection is made.

linger_on_close_msec

This parameter sets how long we wait before starting to close the cluster connection after it has been closed. By default this is set to 500 milliseconds.

use_ndb_async_api

Setting this to true means that we are going to use the asynchronous NDB API functionality to increase the parallelism in using the NDB API. By default this is set to false.

use_mapped_ndb_record

Setting this to true means that results fetched from NDB are kept in NDB API buffers; otherwise they are copied out to JavaScript objects. Defaults to true.

ndb_session_pool_min

Each cluster connection uses a pool of NDB objects (each Session object in the ndb adapter maps to one NDB object). This parameter sets the minimum number of objects in this pool. It defaults to 4.

ndb_session_pool_max

This parameter sets the maximum number of objects in the NDB object pool. By default it is set to 100.

ndb_session_concurrency

The maximum number of transactions in an NDB session; only one is visible to the user, but multiple may be started before the current one has finished. Defaults to 4. If the asynchronous API is used it is recommended to increase this parameter to 1024, since the asynchronous API can run many transactions in parallel.
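As a minimal sketch, these ndb properties can be set programmatically on a ConnectionProperties object before connecting; the values below are purely illustrative, not recommendations.

  var db_con = require("database-jones");
  var con_prop = new db_con.ConnectionProperties("ndb");
  con_prop.ndb_connect_retries = -1;        // illustrative: retry indefinitely
  con_prop.ndb_connect_delay = 2;           // illustrative: two seconds between attempts
  con_prop.use_ndb_async_api = true;        // use the asynchronous NDB API
  con_prop.ndb_session_concurrency = 1024;  // recommended when using the async API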

Creating a ConnectionProperties object

This object is created by calling ConnectionProperties with the first parameter set to a string, either ndb or mysql depending on which adapter is used.

The second parameter is optional; if it is not set, a default object is returned. The application can then change this object programmatically as desired before proceeding.

The second parameter can also be a function in which case this function is used to fill in the variables in this object.

The second argument can be a string pointing to which deployment to use (several can be defined).

The deployment is found in a file called jones_deployment.js. An example file of this is found in the top directory of the mysql-js installation.

The search for this file starts in the directory containing the currently executing main source file. Next it tries the directory above and continues until it reaches the root of the file system. Finally it attempts to find it in the current working directory.
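As a sketch, assuming each deployment is simply an exported function that fills in the properties object, a jones_deployment.js could look like this; the values are illustrative only.

  /* Hypothetical jones_deployment.js with one deployment named mtr */
  exports.mtr = function(props) {
    props.mysql_port = 13001;  // illustrative: the MySQL Server started by MTR
    props.database = "test";
  };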

Alternative 1:
  var db_con = require("database-jones");
  var con_prop = new db_con.ConnectionProperties("ndb");
  con_prop.database = "new_db";

Alternative 2:
  function set_prop(props) {
    props.database = "new_db";
  }
  var con_prop = new db_con.ConnectionProperties("ndb", set_prop);

Alternative 3:
  var con_prop = new db_con.ConnectionProperties("ndb", "test");

Cluster connection object

After calling require on Database Jones and creating a ConnectionProperties object, we are ready to create a cluster connection object.

For the mysql adapter this object is simply a pool of connections to the MySQL Server. Each session will be represented by a connection to the MySQL Server.

For the ndb adapter this object creates a cluster connection. Thus we can use multiple such objects in one Node.js server, since NDB supports having multiple cluster connections from one program.

Each cluster connection will have its own SessionFactory returned and when a Session object is created it will use an NDB object to map the Session calls to NDB API calls.

The SessionFactory object is created by a call taking two parameters: the ConnectionProperties object and a mappings object. The mappings object can be an array of table mappings to be used by the cluster connection.
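As a minimal sketch, using the names from the startup example later in this chapter, creating the SessionFactory and opening a Session could look like this:

  db_con.connect(con_prop, table_maps).
    then(function(sessionFactory) { return sessionFactory.openSession(); }).
    then(function(session) { /* use the session here */ }, failFunction);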

Table mappings

Table mappings in Database Jones are very similar to the annotations used in ClusterJ. In our example we will use the Customer table. First we need to create a function to fill in the Customer object. In this example the JavaScript object has two fields, id and name. The table uses the column names customer_id and name.

This constructor is called without parameters when a row is read from the database; it can also be called from application logic with the proper parameters. By checking whether the first parameter is undefined we can decide whether to set the variables.

  function Customer(customerId, customerName) {
    if (customerId != undefined)
    {
      this.id = customerId;
      this.name = customerName;
    }
  }

The simplest method to map this table would be the following code.

  var db_con = require("database-jones");
  var cust_map = new db_con.TableMapping('customer').
    applyToClass(Customer);

Now this won't work, since the id is stored in a column called customer_id. We also need to define the field mappings, like this:

  var db_con = require("database-jones");
  var cust_map = new db_con.TableMapping('customer');
  cust_map.mapField("id", "customer_id");
  cust_map.mapField("name");
  cust_map.applyToClass(Customer);

Here we provide a mapping between the id used in the JavaScript object and the customer_id used in the table.

It is also possible to specify a Converter object that converts between database types and JavaScript types. It is also possible to set a field to not be persistent. For more details on this see the documentation in the database-jones/API-documentation directory.
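As a sketch, assuming mapField accepts an optional Converter argument as described in that documentation, a converter between a JavaScript boolean and a hypothetical integer column active_flag could look like this:

  // Hypothetical converter: toDB/fromDB translate between the
  // JavaScript value and the database value.
  var booleanConverter = {
    toDB:   function(jsValue) { return jsValue ? 1 : 0; },
    fromDB: function(dbValue) { return dbValue !== 0; }
  };
  cust_map.mapField("active", "active_flag", booleanConverter);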

It is not necessary to specify indexes in the table mapping. For all key lookup operations (both read and write), all primary key columns must be specified in the object. Alternatively, all columns of a unique key must be specified. Otherwise an exception is generated by the call on the Session object.

The table mapping will use all columns of the table. If the table changes such that a new column is added while the cluster connection is up and running, the new column will not be part of the mapping. Thus schema modifications and software upgrades must be closely aligned.

Startup example

  var db_con = require("database-jones");
  var con_prop = new db_con.ConnectionProperties("ndb", "test");
  function Customer(customerId, customerName) {
    if (customerId != undefined)
    {
      this.id = customerId;
      this.name = customerName;
    }
  }
  var table_maps = [];
  var cust_map = new db_con.TableMapping('customer');
  cust_map.mapField("id", "customer_id");
  cust_map.mapField("name");
  table_maps.push(cust_map.applyToClass(Customer));
  function successFunction() {
    ...
  }
  function failFunction() {
    ...
  }
  function exampleOperation(session) {
    ...
  }
  db_con.connect(con_prop, table_maps).
    then(function(sessionFactory) { return sessionFactory.openSession(); }).
    then(exampleOperation, failFunction);

The Node.js API has a lot of similarities to the ClusterJ API. This isn't surprising, since the ClusterJ architect was also involved in designing the Node.js API.

There is one main difference though: Node.js is a language designed for asynchronous operation. Each time we call a function that accesses the NDB data nodes we can use callbacks to continue the execution. This makes it possible to perform rather complex interactions with several steps in a few lines of simple code.

We use the standard promises design pattern in some of these examples. The reason is that most of the asynchronous calls in Database Jones return a promise that is Promises/A+ compatible.

The idea with promises is that they either resolve with success and a return value, or reject with a failure and a return value. The return value can be multiple objects.

Putting these things together, our examples will always use the startup code above. Here exampleOperation will be replaced by the operation we want to demonstrate in the coming sections. We use the SessionFactory returned by the connect call to create a Session object to use in our examples. We ignore the error handling function and the success function here.

Session operations

Inserting a row

The example below shows a very simple insert of a row. We create a new Customer object and call the method persist on the Session object. This performs an insert of the provided object.

  function insertOperation(session) {
    var newCust = new Customer(100, "Mikael");
    session.persist(newCust).
      then(successFunction, failFunction);
  }

Updating a row

To update the row instead of inserting it, we simply replace the persist function with the update function.

  function updateOperation(session) {
    var newCust = new Customer(100, "Mikael");
    session.update(newCust).
      then(successFunction, failFunction);
  }

Deleting a row

To delete a row we only need to specify the primary key, and we use the remove function to perform the delete operation.

  function deleteOperation(session) {
    var newCust = new Customer(100);
    session.remove(newCust).
      then(successFunction, failFunction);
  }

Writing a row

To write a row (insert if a row with the same primary key doesn't exist, update if it does), one uses the save method.

  function writeOperation(session) {
    var newCust = new Customer(100, "Mikael");
    session.save(newCust).
      then(successFunction, failFunction);
  }

Reading and updating a row

Now we move on to a slightly more advanced example. In the previous examples we executed in autocommit mode; only one operation was executed in the transaction, so there was no need to introduce any transactions. Now we move on to a transaction where we first read the row, followed by an update of the row where we want to change the name of the customer.

In this case we create a new function changeCustomer to handle the change of the customer name.

We introduce a transaction that we begin before reading the row and commit after updating it. The method currentTransaction returns the transaction object, on which we can call begin(), commit() and rollback(). We can also mark the transaction as only abortable by calling setRollbackOnly(), and check whether it is only abortable by calling isRollbackOnly().
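A minimal sketch of this transaction interface (the somethingWentWrong condition is a hypothetical application check):

  var tx = session.currentTransaction();
  tx.begin();
  // ... perform operations on the session ...
  if (somethingWentWrong) {  // hypothetical application condition
    tx.setRollbackOnly();    // the transaction can now only be aborted
  }
  if (tx.isRollbackOnly()) {
    tx.rollback();
  } else {
    tx.commit();
  }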

We set the Session to now work with exclusive locks on reads to ensure that the transaction doesn’t deadlock after attempting to perform a lock upgrade from shared lock to exclusive lock. The lock modes that can be set are EXCLUSIVE, SHARED and NONE. NONE corresponds to READ COMMITTED.

When a transaction is started on the Session object, the call to update will return immediately; it will be executed by the next operation that requires an interaction with the NDB data nodes (calls to flush, find, commit or rollback).

  function readOperation(session) {
    function changeCustomer(customer, name) {
      customer.name = name;
      return session.update(customer);
    }
    var customerName = "Mikael";
    var customerId = 100;
    var customer = new Customer(customerId);
    session.currentTransaction().begin();
    session.lockMode("EXCLUSIVE");
    session.find("customer", customer).
      then(function(found) { return changeCustomer(found, customerName); }).
      then(function() { return session.currentTransaction().commit(); }).
      then(successFunction, failFunction);
  }

Batching multiple operations

In this example we show how to batch two insert operations into one execution towards the NDB data nodes.

To do this we introduce a batch object by calling the method createBatch on the Session object. This object has more or less the same interface as the Session object, except that the calls are batched for later execution instead of being executed immediately.

To execute the batched operations we need to call execute on the batch object.

  function insertMultipleRows(session) {
    var batch = session.createBatch();
    var newCust1 = new Customer(101, "Mikael jr");
    var newCust2 = new Customer(102, "Mikaela");
    batch.persist(newCust1);
    batch.persist(newCust2);
    batch.execute().
      then(successFunction, failFunction);
  }

We can also batch multiple keyed reads using the same batch interface, as shown here. After reading the rows we update them again after changing the customer names.

  function updateMultipleRows(session, cust1, cust2) {
    var batch = session.createBatch();
    cust1.name = "Michael jr";
    cust2.name = "Michaela";
    batch.update(cust1);
    batch.update(cust2);
    return batch.execute().
      then(function() { return session.currentTransaction().commit(); });
  }
  function readAndUpdateMultipleRows(session) {
    session.currentTransaction().begin();
    session.lockMode("EXCLUSIVE");
    var batch = session.createBatch();
    var newCust1 = new Customer(101);
    var newCust2 = new Customer(102);
    batch.find("customer", newCust1);
    batch.find("customer", newCust2);
    batch.execute().
      then(function() { return updateMultipleRows(session, newCust1, newCust2); }).
      then(successFunction, failFunction);
  }

Scanning rows

There is also an interface to scan tables through an ordered index. To do this we first need to create a query object. When creating the query object we can supply the table name or a table mapping object.

This method returns a Query object. Through this query object we can define the query search condition. We define one predicate per call. All functions that define predicates return a query predicate that can be passed into a new predicate function.

The conditions supported are eq (equality condition), ne (not equal), gt (greater than), ge (greater than or equal), lt (less than), le (less than or equal), isNull, isNotNull.

There are several functions for combining two predicates:

and(predicate) (true if both predicates are true), or, andNot (this query predicate combined with the negation of the parameter predicate), orNot, not.
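A small sketch of combining predicates, assuming a query object created from the customer table used earlier:

  // Customers with an id greater than 100 and a non-null name
  var predicate = query.id.gt(100).and(query.name.isNotNull());
  query.where(predicate);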

  function handle_results(err, results) {
    if (err) { ... } // Error code
    else     { ... } // Successful code
  }
  function query_definition(query) {
    query.where(query.name.eq("Mikael"));
    return query.execute({"order" : "desc"}, handle_results);
  }
  function scanCustomers(session) {
    session.createQuery("customer").
      then(query_definition, failFunction);
  }

In the example we use the short form where query.name refers to the name column. In the execute call we define the query to be executed in descending order. It is also possible to set a limit in the same manner, which limits the number of rows to return, and a skip, which skips a number of rows before starting to report rows.

The default order is no order, the default limit is several billion rows, and skip defaults to 0.
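As a sketch, an execute call using all three options could look like this (the values are illustrative):

  // Inside query_definition above: skip the first 20 matching rows,
  // return at most 10 of them, scanning the index in descending order.
  return query.execute({"order" : "desc", "limit" : 10, "skip" : 20});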

The result of the execute method is returned with an error condition and an array of result objects. The result objects are in this case a set of Customer objects.
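As a sketch, handle_results could consume this array like this (the log output is purely illustrative):

  function handle_results(err, results) {
    if (err) { console.log("Query failed:", err); return; }
    results.forEach(function(customer) {
      // Each result is a Customer object as mapped earlier
      console.log(customer.id, customer.name);
    });
  }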

Join operations

There is an interface for defining join operations using Database Jones. It uses the Projection interface. Since this code is a bit less stable than the rest of the code, I will skip describing it here. It is documented in the same place as the rest of the code.

We recommend using SQL to execute complex queries such as joins between tables, as this is the most well tested method of executing complex queries. If the query is complex it makes sense to pay the small extra cost of shipping it over to the MySQL Server for execution.

Final comments

The above examples were developed to show one way of using Database Jones to develop a web application. More complete examples are found in the samples/tweet directory.

The API that Database Jones supports makes it possible to access NDB in many more ways than I have shown here. These examples are there to get you started in developing more interesting applications using Database Jones.

The complete API documentation is found in the database-jones/API-documentation directory.