Release Notes RonDB 21.04.14#

RonDB 21.04.14 is the fourteenth release of RonDB 21.04.

RonDB 21.04 is based on MySQL NDB Cluster 8.0.23. It is a bug fix release based on RonDB 21.04.12. The release 21.04.13 was only used internally in Hopsworks.

RonDB 21.04.14 is released as open source software with binary tarballs for use on Linux. It is developed on Linux and Mac OS X, and automated testing also uses WSL 2 on Windows (Linux on Windows).

RonDB 21.04.14 can be used with both x86_64 and ARM64 architectures although ARM64 is still in beta state.

RonDB 21.04.14 is tested and verified on both x86_64 and ARM platforms using both Linux and Mac OS X. However, it is only released with a binary tarball for x86_64 on Linux.

The git log also contains many commits relating to the REST API server. While the REST API server was under heavy development, we did not mention the changes related to it in the release notes. From RonDB 21.04.14 the REST API server is at a quality level where we have released it for production usage.

Also build fixes are not listed in the release notes, but can be found in the git log.

Description of RonDB#

RonDB is designed to be used in a managed environment where the user only needs to specify the type of the virtual machine used by the various node types. RonDB has the features required to build a fully automated managed RonDB solution.

It is designed for applications requiring the combination of low latency, high availability, high throughput and scalable storage.

You can use RonDB in a serverless version on app.hopsworks.ai. In this case Hopsworks manages the RonDB cluster and you can use it for your machine learning applications. You can use this version for free with certain quotas on the number of Feature Groups (tables) and memory usage. Getting started with this is a matter of a few minutes, since the setup and configuration of the database cluster is already taken care of by Hopsworks.

You can also use the managed version of RonDB available on hopsworks.ai. This sets up a RonDB cluster in your own AWS, Azure or GCP account using the Hopsworks managed software. It creates a RonDB cluster provided a few details on the HW resources to use. These details can either be added through a web-based UI or using Terraform. The RonDB cluster is integrated with Hopsworks and can be used for both RonDB applications as well as for Hopsworks applications.

You can use the cloud scripts, which make it easy to set up a cluster on AWS, Azure or GCP. This requires no previous knowledge of RonDB. The script only needs a description of the HW resources to use, and the rest of the setup is automated.

You can use the open source version and build and set it up yourself. This is the command you can use to download a binary tarball:

# Download x86_64 on Linux
wget https://repo.hops.works/master/rondb-21.04.14-linux-glibc2.17-x86_64.tar.gz

RonDB 21.04 is a Long Term Support version that will be maintained until at least 2024.

Maintaining 21.04 mainly means fixing critical bugs and handling minor change requests. It doesn’t involve merging with any future release of MySQL NDB Cluster. Such merges will be handled in newer RonDB releases.

Backports of critical bug fixes from MySQL NDB Cluster will happen when deemed necessary.

Summary of changes in RonDB 21.04.14#

RonDB 21.04.14 has 8 bug fixes and 4 new features since RonDB 21.04.12. In total RonDB 21.04 contains 31 new features on top of MySQL Cluster 8.0.23 and a total of 132 bug fixes.

Test environment#

RonDB uses four different ways of testing. MTR is a functional test framework built using SQL statements to test RonDB. The Autotest framework is specifically designed to test RonDB using the NDB API. The Autotest is mainly focused on testing high availability features and performs thousands of restarts using error injection as part of a full test suite run. Benchmark testing ensures that we maintain the throughput and latency that is unique to RonDB. Finally we also test RonDB in the Hopsworks environment where we perform both normal actions as well as many actions to manage the RonDB clusters.

RonDB has a number of unit tests that are executed as part of the build process to improve the performance of RonDB.

In addition RonDB is tested as part of testing Hopsworks.

MTR testing#

RonDB has a functional test suite using the MTR (MySQL Test Run) that executes more than 500 RonDB specific test programs. In addition there are thousands of test cases for the MySQL functionality. MTR is executed on both Mac OS X and Linux.

We also have a special mode of MTR testing where we can run with different versions of RonDB in the same cluster to verify our support of online software upgrade.

Autotest#

RonDB is highly focused on high availability. This is tested using a test infrastructure we call Autotest. It contains many hundreds of test variants, and the full set takes around 36 hours to execute. One test run with Autotest uses a specific configuration of RonDB. We execute multiple such configurations, varying the number of data nodes, the replication factor and the thread and memory setup.

An important part of this testing framework is that it uses error injection. This means that we can test exactly what will happen if we crash in very specific situations, if we run out of memory at specific points in the code, and if we change the timing in various ways by inserting small sleeps in critical paths of the code.

During one full test run of Autotest RonDB nodes are restarted thousands of times in all sorts of critical situations.

Autotest currently runs on Linux with a large variety of CPUs, Linux distributions and even on Windows using WSL 2 with Ubuntu.

Benchmark testing#

We test RonDB using the Sysbench test suite, DBT2 (an open source variant of TPC-C), flexAsynch (an internal key-value benchmark), DBT3 (an open source variant of TPC-H) and finally YCSB (Yahoo Cloud Serving Benchmark).

The focus is on testing RonDB’s LATS capabilities (low Latency, high Availability, high Throughput and scalable Storage).

Hopsworks testing#

Finally we also execute tests in Hopsworks to ensure that it works with HopsFS, the distributed file system built on top of RonDB, and HSFS, the Feature Store designed on top of RonDB, and together with all other use cases of RonDB in the Hopsworks framework.

These tests include both functional tests of the Hopsworks framework as well as load testing of HopsFS and Hopsworks.

New features#

RonDB REST API Server#

This new feature is a major new open source contribution by the RonDB team. It has been in the works for almost a year and is now used by some of our customers.

The REST API Server has two variants. The first variant provides read access to tables in RonDB using primary key access. The reads can either read one row per request or use the batch variant that can read multiple rows from multiple tables in one request. This variant supports both access using REST and gRPC. The REST API is by default available on port 4406 and the gRPC is by default using port 5406.
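As a rough illustration of the first variant, the sketch below builds (but does not send) a single-row primary key read request. Only the default REST port 4406 comes from these notes. The URL layout (`/<api-version>/<database>/<table>/pk-read`), the version string `0.1.0`, the body fields `filters` and `readColumns`, and the database, table and column names are all assumptions made for the example and should be checked against the REST API server documentation.

```python
import json

REST_PORT = 4406  # default REST port stated in these release notes


def build_pk_read(host, database, table, key_column, key_value, read_columns,
                  api_version="0.1.0"):
    """Return (url, json_body) for a single-row primary key read.

    The URL layout, the api_version default and the body field names
    are assumptions for illustration, not a confirmed API contract.
    """
    url = f"http://{host}:{REST_PORT}/{api_version}/{database}/{table}/pk-read"
    body = {
        # Assumed shape: one filter per primary key column.
        "filters": [{"column": key_column, "value": key_value}],
        # Assumed shape: the columns to return for the matching row.
        "readColumns": [{"column": c} for c in read_columns],
    }
    return url, json.dumps(body)


# Hypothetical database, table and column names.
url, body = build_pk_read("localhost", "mydb", "mytable", "id", 1, ["name"])
print(url)
print(body)
```

The request could then be sent with any HTTP client as a POST with a JSON body, assuming the endpoint shape above matches the deployed rdrs version.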

The second variant is to use the Feature Store REST API. This interface is used by Hopsworks applications that want direct read access to Feature Groups in Hopsworks.

The REST API server described above is just a first version. We aim to extend it with improved performance, more functionality and advanced features.

The binary for the REST API server is called rdrs and is found in the RonDB binary tarballs.

The documentation of the REST API server is found here.

The documentation of the REST API for the Feature Store is found here

RONDB-282: Parallel copy fragment process#

The copy fragment process has previously been limited to one fragment copy per LDM thread. The copy fragment process is also limited in parallelism such that at most 6000 words are allowed to be outstanding (a row counts for 56 words plus the row size in words).

This means that in particular initial node restart is fairly slow. Thus we improved the parallelism while at the same time maintaining protection against overload of the live node which could be very busy serving readers and writers of the databases.

Actually this parallelism is already implemented in DBDIH. This means that we can already send up to 64 parallel copy fragment requests to a node. However DBLQH imposes a limit of one at a time and queues the other requests.

This patch ensures that we can run up to 8 parallel copy fragment processes per LDM thread. We get data from THRMAN about the resulting CPU load, and we use it both to limit the number of concurrent copy fragment processes and to limit the parallelism per copy fragment process.

One problem when synchronising a node with many disk columns is that we might overload the UNDO log. This would cause the node restart to fail if not avoided. To avoid this we need to be able to halt the copy fragment process and later resume it again.

This functionality was already implemented. However, the old implementation isn’t very likely to work. The new implementation is very simple. The starting node sends a HALT_COPY_FRAG_REQ to the live node(s) performing the copying. The live node doesn’t send anything back to the starting node. It is the responsibility of the live node to halt in a proper manner. Similarly the starting node will send RESUME_COPY_FRAG_REQ when it is ok to copy again. Currently the halt will halt all copy fragment processes even if they don’t have any disk columns. It is likely that the tables with disk columns are the bottleneck for the node restart anyway, so special handling wouldn’t make any major difference.
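The halt and resume handshake described above can be pictured with a toy model. This is a minimal sketch only: the class `LiveNodeCopier` and its methods are hypothetical, and only the two signal names come from these notes. The point it shows is that the live node reacts to the signals without replying, and simply stops making progress while halted.

```python
class LiveNodeCopier:
    """Toy model of a live node copying rows to a starting node (hypothetical)."""

    def __init__(self, rows):
        self.pending = list(rows)  # rows still to be copied
        self.halted = False
        self.sent = []             # rows already copied

    def on_signal(self, name):
        # In both cases nothing is sent back to the starting node.
        if name == "HALT_COPY_FRAG_REQ":
            self.halted = True
        elif name == "RESUME_COPY_FRAG_REQ":
            self.halted = False

    def step(self):
        """Copy one row unless halted; return True if a row was sent."""
        if self.halted or not self.pending:
            return False
        self.sent.append(self.pending.pop(0))
        return True


copier = LiveNodeCopier(["r1", "r2", "r3"])
copier.step()                              # copies r1
copier.on_signal("HALT_COPY_FRAG_REQ")
copier.step()                              # no progress while halted
copier.on_signal("RESUME_COPY_FRAG_REQ")
copier.step()                              # copies r2
print(copier.sent)
```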

Handling up to 8 parallel copy fragment processes per LDM thread requires reserving 8 records of each of the following types: operation records, scan records in DBTUP, scan lock records in DBTUP, stored procedure records in DBTUP, scan records in DBLQH and scan lock records in DBACC. With these records reserved we don’t risk that the copy fragment processes run out of memory.

A few tweaks were required to send COPY_FRAGREQ for NON_TRANSACTIONAL use. This is part of restore fragment and is mainly used in initial node restart. The tweaks ensure that we can start more of these requests in parallel.

Both TRANSACTIONAL and NON_TRANSACTIONAL COPY_FRAGREQ required limitations when running against an older node starting up.

A very important part of the work is to also adapt the speed of node recovery based on the CPU usage in the live node. We need to slow down when the CPU gets heavily used.
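One way to picture this adaptation is a function that maps the measured CPU load of the live node to an allowed number of parallel copy fragment processes. The sketch below is purely illustrative: the upper bound of 8 per LDM thread comes from these notes, but the function name and every threshold are hypothetical and do not describe the actual algorithm in RonDB.

```python
MAX_PARALLEL_PER_LDM = 8  # upper bound stated in these release notes


def allowed_copy_processes(cpu_load_percent):
    """Map live-node CPU load to a number of parallel copy fragment processes.

    All thresholds below are hypothetical and chosen only for illustration.
    """
    if not 0 <= cpu_load_percent <= 100:
        raise ValueError("cpu_load_percent must be within 0..100")
    if cpu_load_percent < 50:
        return MAX_PARALLEL_PER_LDM  # plenty of headroom: full parallelism
    if cpu_load_percent < 70:
        return 4
    if cpu_load_percent < 85:
        return 2
    return 1                         # heavily loaded: slow down as far as possible


for load in (10, 60, 80, 95):
    print(load, allowed_copy_processes(load))
```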

The speed of node recovery was previously limited to having 24 kBytes outstanding, where even a row of 0 bytes was counted as 224 bytes. This means that effectively less than 50 rows at a time were outstanding per thread. We increased this limit substantially, up to 192 kBytes. Thus around 400 small rows can be outstanding. This should provide a good balance between latency of operations towards the live node and the speed of node recovery.
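The row counts above follow from simple arithmetic, sketched here. The per-row overhead of 56 words (224 bytes, hence 4 bytes per word) and the limits of 6000 words and 192 kBytes come from these notes. The 256-byte payload used for a "small row" and the reading of 1 kByte as 1000 bytes are assumptions chosen because they reproduce figures close to the ones quoted.

```python
WORD_BYTES = 4           # 56 words == 224 bytes in the notes, so 4 bytes per word
ROW_OVERHEAD_WORDS = 56  # fixed per-row cost stated in the notes


def rows_outstanding(limit_bytes, row_bytes):
    """Number of whole rows that fit within the outstanding-data limit."""
    cost_per_row = ROW_OVERHEAD_WORDS * WORD_BYTES + row_bytes
    return limit_bytes // cost_per_row


OLD_LIMIT = 6000 * WORD_BYTES  # 6000 words, that is 24000 bytes
NEW_LIMIT = 192 * 1000         # 192 kBytes, assuming 1 kByte = 1000 bytes

# 256 bytes is an assumed payload size for a "small row".
for row_bytes in (0, 256):
    print(row_bytes,
          rows_outstanding(OLD_LIMIT, row_bytes),
          rows_outstanding(NEW_LIMIT, row_bytes))
```

With the assumed 256-byte payload, each row costs 480 bytes, which gives 50 rows under the old limit and 400 rows under the new one.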

CPUs have become much faster, and on some fast CPUs measuring in microseconds wasn’t good enough, so we had to raise the resolution to handle nanoseconds. This had an impact on our measurements of CPU load, which is a very important part of this patch for adaptive speed of node recovery.

As part of this patch we also disable ACC scans (previously done in 22.10 branch).

RONDB-364: Add service-name parameter to ndb_mgmd and ndbmtd#

This feature adds a --service-name parameter to ndb_mgmd and ndbmtd. E.g. --service-name=ndbmtd sets the file name of the pid file to ndbmtd.pid and the node log to ndbmtd_out.log, and similarly for trace files, other log files and the error file. It also sets the directory name of the NDB file system to service_name_fs.
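The naming scheme can be summarised with a small helper. This is a sketch, not RonDB code: the function `service_files` is hypothetical, and it assumes that service_name_fs means the service name followed by `_fs`. The pid file and node log patterns are taken from the example in these notes; the trace and error file names are left out because the notes don’t spell them out.

```python
def service_files(service_name):
    """Derive file and directory names from a --service-name value (illustrative)."""
    return {
        "pid_file": f"{service_name}.pid",
        "node_log": f"{service_name}_out.log",
        # Assumption: "service_name_fs" means "<service-name>_fs".
        "file_system_dir": f"{service_name}_fs",
    }


print(service_files("ndbmtd"))
```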

RONDB-385: Update dbt2 and sysbench versions to include new changes#

BUG FIXES#

Bug33172887 password_last_changed is getting updated on sql node restart#

Bug33542052 grant ndb_stored_user alters user with wrong hash#

There were issues with certain passwords that created hashes containing spaces. These spaces weren’t properly handled, which led to issues in the MySQL Server. The above two bug fixes were backported to fix those issues.

RONDB-244: Auto reconnect after temporary network failures#

RONDB-382: More updates to change to GCC 10 and openssl 1.1.1t#

RONDB-395: New retriable error codes#

When a node failure happens, e.g. when running an OLTP RW transaction in Sysbench, we will often report error code 4006. This error code indicates overload and doesn’t necessarily lead to a retry by the application logic. There are two main error categories on the MySQL layer that lead to a retry. These are lock wait timeout and deadlock detected. Thus all transaction failures due to node failures should be mapped to lock wait timeout.

The specific issue with 4006 is when we execute a transaction and we need to perform a scan as part of this transaction. In this case we need the scan to get a connection object from the same data node where the transaction coordinator resides. If not possible the transaction must fail. We will report a new error code 4042 that is mapped to lock wait timeout in this case.

We also map a node failure error from DBSPJ to lock wait timeout as well.
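From the application side, this mapping means that a retry loop keyed on the two MySQL error categories is enough to also cover node failures. The sketch below assumes the standard MySQL error numbers 1205 (lock wait timeout) and 1213 (deadlock). The exception class `SqlError` and the function `run_with_retry` are hypothetical stand-ins for whatever the client library provides.

```python
ER_LOCK_WAIT_TIMEOUT = 1205  # MySQL: lock wait timeout exceeded
ER_LOCK_DEADLOCK = 1213      # MySQL: deadlock found when trying to get lock
RETRIABLE = {ER_LOCK_WAIT_TIMEOUT, ER_LOCK_DEADLOCK}


class SqlError(Exception):
    """Hypothetical stand-in for a client library error carrying a MySQL errno."""

    def __init__(self, errno):
        super().__init__(f"MySQL error {errno}")
        self.errno = errno


def run_with_retry(transaction, max_attempts=3):
    """Run transaction(), retrying only on retriable MySQL errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SqlError as err:
            # Give up on non-retriable errors or when attempts are exhausted.
            if err.errno not in RETRIABLE or attempt == max_attempts:
                raise


calls = []


def flaky():
    """Fails once as if a node failure was mapped to lock wait timeout."""
    calls.append(1)
    if len(calls) == 1:
        raise SqlError(ER_LOCK_WAIT_TIMEOUT)
    return "committed"


print(run_with_retry(flaky))
```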

RONDB-267: Removed log message sent too often#

A message stating MGMD set client ... was printed in the MGM server log every time an MGM client connected, which filled the log a bit too much. It is now only printed in debug binaries.

RONDB-409: Out of CopyActiveRecord in node restart#

During a node restart with 1 LDM thread we could run out of CopyActiveRecord’s with many tables. This was due to three reasons:

1) We didn’t release the CopyActiveRecord after using it. This is true also for CopyFragmentRecord’s. Fixed by release call on pool after removeFirst on queue.

2) The number of COPY_ACTIVEREQ signals in the queue has a very high upper bound, so whatever fixed setting we use here will be insufficient. The current setting is based on the maximum number of LDM threads, which is actually unrelated. To solve this we changed CopyActiveRecord to be a TransientPool record instead.

This problem doesn’t exist for COPY_FRAGREQ. For COPY_FRAGREQ we can at most have 64 outstanding records from DBDIH. Thus we keep CopyFragmentRecord in ArrayPool.

3) Since we can send an unlimited number of COPY_ACTIVEREQ signals, a likely long-term change is to limit this. Otherwise we might also get problems with exploding job buffers. This is however a more intrusive change and is thus deferred to a new feature release.

In addition, when seizing CopyActiveRecord and CopyFragmentRecord we didn’t check that the seize succeeded. Added ndbrequire around those calls.
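The leak and the missing check can be illustrated with a toy fixed-size pool. This is a sketch under stated assumptions: the `ArrayPool` class here is a simplified Python stand-in, not the RonDB class of the same name, and the Python `assert` plays the role that ndbrequire plays in the fix.

```python
from collections import deque


class ArrayPool:
    """Toy fixed-size record pool; seize() returns None when exhausted."""

    def __init__(self, size):
        self.free = list(range(size))

    def seize(self):
        return self.free.pop() if self.free else None

    def release(self, record):
        self.free.append(record)


pool = ArrayPool(2)
queue = deque()


def enqueue_request():
    record = pool.seize()
    # Counterpart of the added ndbrequire: fail loudly on a failed seize.
    assert record is not None, "out of CopyActiveRecord"
    queue.append(record)


def handle_next():
    record = queue.popleft()  # removeFirst on the queue
    pool.release(record)      # the release that was missing before the fix


# More requests than pool entries, handled one at a time.
for _ in range(5):
    enqueue_request()
    handle_next()
print(len(pool.free))
```

Without the `release` call in `handle_next`, the third request in this loop would fail the seize check, which mirrors how a restart with many tables could exhaust the records.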

Finally added signal printers for CopyActive and CopyFrag signals.

Bug26974491 FIXES DATA RESTORE WITH DISABLE INDEXES#

Ensure ndb_restore will drop indexes and foreign keys when used with the option --disable-indexes. Previously this was only working when used in conjunction with --restore-meta. This bug fix is backported from a newer MySQL version.