Release Notes RonDB 21.04.15#

RonDB 21.04.15 is the fifteenth release of RonDB 21.04.

RonDB 21.04 is based on MySQL NDB Cluster 8.0.23. RonDB 21.04.15 is a bug fix release based on RonDB 21.04.14.

RonDB 21.04.15 is released as open source software with binary tarballs for usage in Linux. It is developed on Linux and Mac OS X, and WSL 2 on Windows (Linux on Windows) is used for automated testing.

RonDB 21.04.15 can be used with both x86_64 and ARM64 architectures although ARM64 is still in beta state.

RonDB 21.04.15 is tested and verified on both x86_64 and ARM platforms using both Linux and Mac OS X. It is, however, only released with a binary tarball for x86_64 on Linux.

The git log also contains many commits relating to the REST API server. While the REST API server was under heavy development, we did not mention the changes related to it in the release notes. From RonDB 21.04.14 the REST API is at a quality level where we have released it for production usage.

Also build fixes are not listed in the release notes, but can be found in the git log.

This is the last RonDB 21.04 release to be used in new Hopsworks releases. There might still be bug fix releases to support existing Hopsworks releases. The main focus for new Hopsworks releases is now moved to the RonDB 22.10 release series.

Description of RonDB#

RonDB is designed to be used in a managed environment where the user only needs to specify the type of the virtual machine used by the various node types. RonDB has the features required to build a fully automated managed RonDB solution.

It is designed for applications requiring the combination of low latency, high availability, high throughput and scalable storage.

You can use RonDB in a serverless version on app.hopsworks.ai. In this case Hopsworks manages the RonDB cluster and you can use it for your machine learning applications. You can use this version for free with certain quotas on the number of Feature Groups (tables) and memory usage. Getting started with this is a matter of a few minutes, since the setup and configuration of the database cluster is already taken care of by Hopsworks.

You can also use the managed version of RonDB available on hopsworks.ai. This sets up a RonDB cluster in your own AWS, Azure or GCP account using the Hopsworks managed software. It creates a RonDB cluster given a few details on the HW resources to use. These details can be added either through a web-based UI or using Terraform. The RonDB cluster is integrated with Hopsworks and can be used for both RonDB applications and Hopsworks applications.

You can use the cloud scripts to easily set up a cluster on AWS, Azure or GCP. This requires no previous knowledge of RonDB. The script only needs a description of the HW resources to use, and the rest of the setup is automated.

You can use the open source version and build and set it up yourself. This is the command you can use to download a binary tarball:

# Download x86_64 on Linux
wget https://repo.hops.works/master/rondb-21.04.15-linux-glibc2.17-x86_64.tar.gz

RonDB 21.04 is a Long Term Support version that will be maintained until at least 2024.

Maintaining 21.04 mainly means fixing critical bugs and handling minor change requests. It doesn’t involve merging with any future release of MySQL NDB Cluster. Such merges will be handled in newer RonDB releases.

Backports of critical bug fixes from MySQL NDB Cluster will happen when deemed necessary.

Summary of changes in RonDB 21.04.15#

RonDB has 16 bug fixes since RonDB 21.04.14 and 4 new features. In total RonDB 21.04 contains 35 new features on top of MySQL Cluster 8.0.23 and a total of 148 bug fixes.

Test environment#

RonDB uses four different ways of testing. MTR is a functional test framework built using SQL statements to test RonDB. The Autotest framework is specifically designed to test RonDB using the NDB API. The Autotest is mainly focused on testing high availability features and performs thousands of restarts using error injection as part of a full test suite run. Benchmark testing ensures that we maintain the throughput and latency that is unique to RonDB. Finally we also test RonDB in the Hopsworks environment where we perform both normal actions as well as many actions to manage the RonDB clusters.

RonDB has a number of unit tests that are executed as part of the build process to improve the performance of RonDB.

In addition RonDB is tested as part of testing Hopsworks.

MTR testing#

RonDB has a functional test suite using MTR (MySQL Test Run) that executes more than 500 RonDB specific test programs. In addition there are thousands of test cases for the MySQL functionality. MTR is executed on both Mac OS X and Linux.

We also have a special mode of MTR testing where we can run with different versions of RonDB in the same cluster to verify our support of online software upgrade.

Autotest#

RonDB is highly focused on high availability. This is tested using a test infrastructure we call Autotest. It contains many hundreds of test variants, and the full set takes around 36 hours to execute. One test run with Autotest uses a specific configuration of RonDB. We execute multiple such configurations, varying the number of data nodes, the replication factor and the thread and memory setup.

An important part of this testing framework is that it uses error injection. This means that we can test exactly what will happen if we crash in very specific situations or if we run out of memory at specific points in the code. We can also change the timing in various ways by inserting small sleeps in critical paths of the code.

During one full test run of Autotest, RonDB nodes are restarted thousands of times in all sorts of critical situations.

Autotest currently runs on Linux with a large variety of CPUs, Linux distributions and even on Windows using WSL 2 with Ubuntu.

Benchmark testing#

We test RonDB using the Sysbench test suite, DBT2 (an open source variant of TPC-C), flexAsynch (an internal key-value benchmark), DBT3 (an open source variant of TPC-H) and finally YCSB (Yahoo Cloud Serving Benchmark).

The focus is on testing RonDB's LATS capabilities (low Latency, high Availability, high Throughput and scalable Storage).

Hopsworks testing#

Finally we also execute tests in Hopsworks to ensure that it works with HopsFS, the distributed file system built on top of RonDB, and HSFS, the Feature Store designed on top of RonDB, and together with all other use cases of RonDB in the Hopsworks framework.

These tests include both functional tests of the Hopsworks framework as well as load testing of HopsFS and Hopsworks.

New features#

RonDB REST API Server#

This new feature is a major new open source contribution by the RonDB team. It has been in the works for almost a year and is now used by some of our customers.

The REST API Server has two variants. The first variant provides read access to tables in RonDB using primary key access. The reads can either read one row per request or use the batch variant that can read multiple rows from multiple tables in one request. This variant supports both access using REST and gRPC. The REST API is by default available on port 4406 and the gRPC is by default using port 5406.
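As a rough illustration of the two read styles, the sketch below builds (but does not send) a single primary key read and a batch request. Only the default REST port 4406 comes from these notes. The host name, the `0.1.0` version prefix, the `pk-read` and `batch` paths, the JSON field names and the database, table and column names are all assumptions made for illustration and should be checked against the REST API server documentation.

```python
import json

REST_PORT = 4406        # default REST port stated in these notes
API_VERSION = "0.1.0"   # assumption: version prefix used in the URL path


def pk_read_request(host, db, table, key, read_columns):
    """Build the URL and JSON body for one primary key read (assumed layout)."""
    url = f"http://{host}:{REST_PORT}/{API_VERSION}/{db}/{table}/pk-read"
    body = {
        # One filter per primary key column (assumed field names).
        "filters": [{"column": c, "value": v} for c, v in key.items()],
        # Columns to return for the matching row (assumed field names).
        "readColumns": [{"column": c} for c in read_columns],
    }
    return url, body


def batch_request(host, reads):
    """Combine several primary key reads, possibly on different tables."""
    url = f"http://{host}:{REST_PORT}/{API_VERSION}/batch"
    operations = []
    for db, table, key, read_columns in reads:
        _, body = pk_read_request(host, db, table, key, read_columns)
        operations.append({
            "method": "POST",
            "relative-url": f"{db}/{table}/pk-read",
            "body": body,
        })
    return url, {"operations": operations}


if __name__ == "__main__":
    # Hypothetical database "shop" with a table "customers" keyed on "id".
    url, body = pk_read_request("localhost", "shop", "customers", {"id": 7}, ["name"])
    print(url)
    print(json.dumps(body, indent=2))
```

The request is only constructed here so that the sketch runs without a cluster. Sending it would be a matter of posting the JSON body to the URL with any HTTP client.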

The second variant is to use the Feature Store REST API. This interface is used by Hopsworks applications that want direct read access to Feature Groups in Hopsworks.

The REST API server described above is just a first version. We aim to extend it with improved performance, more functionality and advanced features.

The binary for the REST API server is called rdrs and is found in the RonDB binary tarballs.

The documentation of the REST API server is found here.

The documentation of the REST API for the Feature Store is found here.

RONDB-468: Separate connections for data and metadata operations#

The REST API server needs to read metadata to perform its services. This feature makes it possible to store the metadata in a separate RonDB cluster from where the feature data is stored.

RONDB-473: Minor adjustments of hash table sizes and hash functions#

RONDB-342: New hash function XX3_HASH64#

Newer 21.04 versions support a new hash function, XX3_HASH64. RonDB 21.04 will not create any new tables with the new hash function, but it can use tables that were created with it by a newer RonDB version. This ensures that older NDB API versions can coexist in newer 22.10-based clusters, and in particular it supports downgrade from 22.10 to 21.04.15.

FSTORE-953: Add feature store documentation#

BUG FIXES#

RONDB-500: Increase connection timeout for REST API Server#

RONDB-475: Crash when combining NDB Stored users with MySQL Replication#

In certain situations a replicated user command ran a SELECT command. This failed because the command had no result set although it was successful. The cause was that the slave_thread variable was set, which meant that SELECT queries were ignored. A fix and a new test case were added.

RONDB-477: Added Health endpoint to REST API Server#

FSTORE-104: Failed to return features when only primary key is selected#

RONDB-478: Crash in jam from query thread#

A query thread executed a function which didn’t have a block object and failed to bring the jam buffer, thus the function crashed when trying to insert into the jam buffer.

RONDB-459: Fix bug in duplicate hostname check#

The method sendStopMgmd is used for both shutdown and stop of nodes. When deactivating a node we stop the node after deactivating it. In the case of deactivation of a MGM server we thus need to ensure that we stop a node even if it is deactivated since it is part of the deactivation. This required a new parameter to sendStopMgmd to decide whether to handle stop of deactivated nodes or not.

Added checks that we don’t stop deactivated nodes, since they should already be stopped.

RONDB-472: Fixed latency report on transactions in flexAsynch#

RONDB-472: More flags for flexAsynch and improved reporting#

-dirty flag removed and replaced by 3 new flags:

-dirty_write: Writes and updates are with dirty flag
-dirty_read: Reads are using dirty flag (default setting)
-locked_read: Normal reads using locks

RONDB-471: Memory leak for dirty writes#

A memory leak of Commit Ack Marker records led to lost memory and much worse performance for all operations. Fixed by not setting the Marker flag in LQHKEYREQ for dirty writes, since those operations will release the commit ack marker without contacting DBLQH.

RONDB-282: Fix a test case for parallel copy fragment#

As part of the parallel copy fragment process improvements, the number of outstanding LQHKEYREQ signals increased. One test case uses error injection 5106 to delay LQHKEYCONF. With more outstanding LQHKEYREQ signals, this led to running out of space in the long time delay queue.

Changed the test case so that it only delays every 20th LQHKEYCONF signal instead of delaying every signal.
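The effect of this change can be pictured with a small model. This is not the actual error injection code. The class name, the counter logic and the signal count of 1000 are hypothetical, and the sketch only shows why delaying one signal in twenty puts far fewer entries into a bounded delay queue than delaying every signal.

```python
class DelayEveryNth:
    """Hypothetical model of choosing which signals to delay."""

    def __init__(self, interval):
        self.interval = interval  # 1 = delay every signal, 20 = every 20th
        self.seen = 0

    def should_delay(self):
        # Count each signal and delay only when the counter hits the interval.
        self.seen += 1
        return self.seen % self.interval == 0


def delayed_count(total_signals, interval):
    """Count how many of total_signals would enter the delay queue."""
    chooser = DelayEveryNth(interval)
    return sum(1 for _ in range(total_signals) if chooser.should_delay())


if __name__ == "__main__":
    print(delayed_count(1000, 1))   # old behaviour: every signal is delayed
    print(delayed_count(1000, 20))  # new behaviour: every 20th signal is delayed
```

With the same number of signals, the delay queue load in this model drops by a factor of 20, while the test still exercises the delayed path.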

FSTORE-982: Adopt serving key in feature store Rest API#

RONDB-449: Increased number of slots in time queues#

RONDB-442: Metrics Refactoring of REST API#

RONDB-415: Fixing failed sub operation in Batch requests for REST API#

RONDB-440: Fixed a hole in fragment protection for multi range scans#

When performing a multi range scan we set up the second range scan using the method that also acquires the fragment mutex if required. However, immediately after that we released the fragment lock and returned. After the return we continued scanning the new range, but now without the fragment lock since it had been released.

This can cause problems only in Query thread scans and only with range scans scanning more than 1 range and of course executing at a time when someone is changing the ordered index.
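The locking pattern involved can be sketched in a few lines. This is a simplified model and not RonDB code. The `Fragment` class, the function name and the use of a Python `threading.Lock` as a stand-in for the fragment mutex are all assumptions. The point shown is only that set-up and scanning of each range happen under the same lock hold, so no range is scanned after the lock has been released.

```python
import threading


class Fragment:
    """Hypothetical stand-in for a fragment with an ordered index."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.mutex = threading.Lock()  # models the fragment mutex


def scan_ranges(fragment, ranges):
    """Scan several inclusive [low, high] ranges of the ordered keys."""
    found = []
    for low, high in ranges:
        # Set up the range and scan it inside one lock hold. The hole
        # described above corresponds to releasing the lock after set-up
        # and then continuing the scan unprotected.
        with fragment.mutex:
            assert fragment.mutex.locked()
            found.extend(k for k in fragment.keys if low <= k <= high)
    return found


if __name__ == "__main__":
    frag = Fragment([1, 3, 5, 7, 9, 11])
    print(scan_ranges(frag, [(1, 3), (7, 10)]))
```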