
Release Notes RonDB 22.10.7#

RonDB 22.10.7 is the seventh release of the RonDB 22.10 series. In Hopsworks it is integrated into Hopsworks 3.9 and the 4.x versions.

RonDB 22.10.7 is based on MySQL NDB Cluster 8.0.34 and RonDB 21.04.17.

RonDB 21.04 is a Long-Term Support version of RonDB that will be supported at least until 2024.

RonDB 22.10 is a new Long-Term support version and will be maintained at least until 2025.

RonDB 22.10 is released as open source software with binary tarballs for use on Linux. It is developed on Linux and Mac OS X, and on Windows using WSL 2 (Linux on Windows).

RonDB 22.10.2 and onwards are supported on Linux/x86_64 and Linux/ARM64.

The other platforms are currently for development and testing. Mac OS X is a development platform and will continue to be so.

Description of RonDB#

RonDB is designed to be used in a managed cloud environment where the user only needs to specify the type of the virtual machine used by the various node types. RonDB has the features required to build a fully automated managed RonDB solution.

It is designed for applications requiring the combination of low latency, high availability, high throughput and scalable storage (LATS).

You can use RonDB in a Serverless version on app.hopsworks.ai. In this case Hopsworks manages the RonDB cluster and you can use it for your machine learning applications. You can use this version for free, with certain quotas on the number of Feature Groups (tables) you are allowed to add and on the memory usage. You can get started in a minute; there is no need to set up any database cluster or worry about its configuration, it is all taken care of.

You can use the managed version of RonDB available on hopsworks.ai. This sets up a RonDB cluster in your own AWS, Azure or GCP account using the Hopsworks managed software, given a few details about the hardware resources to use. These details can be provided either through a web-based UI or using Terraform. The RonDB cluster is integrated with Hopsworks and can be used both for RonDB applications and for Hopsworks applications.

You can use the cloud scripts that enable you to easily set up a cluster on AWS, Azure or GCP. This requires no previous knowledge of RonDB; the script only needs a description of the hardware resources to use, and the rest of the setup is automated.

You can use the open source version with the binary tarball and set it up yourself.

You can use the open source version, building and setting it up yourself.

These are the commands you can use to retrieve the binary tarball:

# Download x86_64 on Linux
wget https://repo.hops.works/master/rondb-22.10.7-linux-glibc2.28-x86_64.tar.gz
# Download ARM64 on Linux
wget https://repo.hops.works/master/rondb-22.10.7-linux-glibc2.28-arm64_v8.tar.gz

These versions are also available as Docker containers at hub.docker.com under hopsworks/rondb; 22.10.7 is currently the latest version. See https://hub.docker.com/r/hopsworks/rondb/tags for an up-to-date list of the available RonDB container images. These images can be used to run RonDB in containers; since Hopsworks 4.0 they are also used as containers in the Hopsworks Kubernetes cluster.

The actions to build both x86_64 and ARM64 tarballs have now been fully automated.

Summary of changes in RonDB 22.10.3 to 22.10.7#

RonDB 22.10.7 is based on MySQL NDB Cluster 8.0.34 and RonDB 21.04.17.

RonDB 22.10.7 adds 17 new features on top of RonDB 21.04.16 and 52 new features on top of MySQL NDB Cluster 8.0.34. In addition, it adds a new product, the REST API Server, which can be used to access data in RonDB using a REST protocol. The current version of this product is implemented in Go; a new version implemented in C++ is in development and will be introduced in RonDB 24.10.0.
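As a rough illustration of how a client might talk to the REST API Server, the hedged Go sketch below performs a primary-key read over HTTP. The port, endpoint path, database, table and payload shape are assumptions made for illustration only; consult the RonDB REST API documentation for the actual protocol.

// Hedged sketch: a primary-key read through the RonDB REST API Server.
// The port, endpoint path and JSON payload below are illustrative
// assumptions, not a definitive description of the RDRS protocol.
package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Hypothetical request: read the row with id = 42 from mydb.mytable.
    body := []byte(`{"filters":[{"column":"id","value":42}]}`)
    resp, err := http.Post("http://localhost:4406/0.1.0/mydb/mytable/pk-read",
        "application/json", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    data, _ := io.ReadAll(resp.Body)
    fmt.Println(resp.Status, string(data))
}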

Test environment#

RonDB uses four different ways of testing. MTR is a functional test framework built using SQL statements to test RonDB.

The Autotest framework is specifically designed to test RonDB using the NDB API. The Autotest is mainly focused on testing high availability features and performs thousands of restarts using error injection as part of a full test suite run.

Benchmark testing ensures that we maintain the throughput and latency that is unique to RonDB. The benchmark suites used are integrated into the RonDB binary tarball making it very straightforward to run benchmarks for RonDB.

Finally we also test RonDB in the Hopsworks environment where we perform both normal actions as well as many actions to manage the RonDB clusters.

RonDB also has a number of MTR tests that are executed as part of the build process to verify the performance of RonDB.

MTR testing#

RonDB has a functional test suite using MTR (MySQL Test Run) that executes more than 500 RonDB-specific test programs. In addition, there are thousands of test cases for the MySQL functionality. MTR is executed on both Mac OS X and Linux.

We also have a special mode of MTR testing where we can run with different versions of RonDB in the same cluster to verify our support of online software upgrade.

Autotest#

RonDB is very focused on high availability. This is tested using a test infrastructure we call Autotest. It contains many hundreds of test variants, and executing the full set takes around 36 hours. One test run with Autotest uses a specific configuration of RonDB. We execute multiple such configurations, varying the number of data nodes, the replication factor, and the thread and memory setup.

An important part of this testing framework is that it uses error injection. This means that we can test exactly what happens if a node crashes in very specific situations, if we run out of memory at specific points in the code, and under various changes to timing made by inserting small sleeps in critical paths of the code.

During one full test run of Autotest, RonDB nodes are restarted thousands of times in all sorts of critical situations.

Autotest currently runs on Linux with a large variety of CPUs, Linux distributions and even on Windows using WSL 2 with Ubuntu.

Benchmark testing#

We test RonDB using the Sysbench test suite, DBT2 (an open source variant of TPC-C), flexAsynch (an internal key-value benchmark), DBT3 (an open source variant of TPC-H) and finally YCSB (Yahoo Cloud Serving Benchmark).

The focus is on testing RonDB's LATS capabilities (low Latency, high Availability, high Throughput and scalable Storage).

Hopsworks testing#

Finally, we also execute tests in Hopsworks to ensure that RonDB works with HopsFS, the distributed file system built on top of RonDB, with HSFS, the Feature Store designed on top of RonDB, and with all other use cases of RonDB in the Hopsworks framework.

New features#

RONDB-789: Find out memory availability in a container#

RonDB uses Automatic Memory Configuration by default. In this mode RonDB discovers the amount of memory available and allocates most of it to the RonDB data nodes. On Linux, using VMs or bare metal servers, this information is found in /proc/meminfo. When running in a container, the information is instead found in /sys/fs/cgroup/memory.max. The amount of memory available to a data node is set in the RonDB Helm charts and can thus now be detected automatically, without extra configuration variables. Setting TotalMemoryConfig will still override the discovered memory size.
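The following is a minimal Go sketch of this detection logic, assuming cgroup v2; it only illustrates the idea and is not RonDB's actual C++ implementation.

// Minimal sketch of container-aware memory discovery (cgroup v2 assumed).
// Illustrative only; the RonDB data node implements this in C++.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
    "strings"
)

// availableMemoryBytes returns the cgroup memory limit if one is set,
// otherwise the total memory of the machine from /proc/meminfo.
func availableMemoryBytes() (uint64, error) {
    if data, err := os.ReadFile("/sys/fs/cgroup/memory.max"); err == nil {
        s := strings.TrimSpace(string(data))
        if s != "max" { // "max" means the container has no memory limit
            return strconv.ParseUint(s, 10, 64)
        }
    }
    f, err := os.Open("/proc/meminfo")
    if err != nil {
        return 0, err
    }
    defer f.Close()
    scanner := bufio.NewScanner(f)
    for scanner.Scan() {
        // The line looks like: "MemTotal:       16384256 kB"
        fields := strings.Fields(scanner.Text())
        if len(fields) >= 2 && fields[0] == "MemTotal:" {
            kb, err := strconv.ParseUint(fields[1], 10, 64)
            return kb * 1024, err
        }
    }
    return 0, fmt.Errorf("MemTotal not found in /proc/meminfo")
}

func main() {
    mem, err := availableMemoryBytes()
    if err != nil {
        panic(err)
    }
    fmt.Printf("available memory: %d bytes\n", mem)
}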

RONDB-785: Set LocationDomainId dynamically#

With RonDB Kubernetes support it is very easy to set up the cluster in such a way that the nodes in the RonDB cluster are spread over several Availability Zones (Availability Domains in Oracle Cloud). In order to avoid sending network messages over Availability Zone boundaries more than necessary, we try to locate the transaction coordinator in our own domain and to read data from our own domain whenever possible.

To avoid complex Kubernetes setups, this required the ability to set the domain in the RonDB data node container using a RonDB management client command. In RonDB we use a Location Domain Id to figure out which Availability Zone we are in. How this id is set is up to the management software (Kubernetes and containers in our case).
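As an illustration, the Go sketch below maps an Availability Zone name (for example injected into the container by the management software) to a numeric Location Domain Id. The NODE_ZONE variable and the zone-to-id mapping are hypothetical and only show one way the management software could derive the id.

// Hypothetical sketch: derive a numeric LocationDomainId from the
// Availability Zone the container runs in. The NODE_ZONE variable and the
// mapping below are illustrative assumptions, not part of RonDB itself.
package main

import (
    "fmt"
    "os"
)

func main() {
    zone := os.Getenv("NODE_ZONE") // e.g. "eu-north-1a"
    domains := map[string]int{
        "eu-north-1a": 1,
        "eu-north-1b": 2,
        "eu-north-1c": 3,
    }
    id, ok := domains[zone]
    if !ok {
        id = 0 // fall back when the zone is unknown
    }
    fmt.Printf("using LocationDomainId %d for zone %q\n", id, zone)
}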

This feature makes it possible to access RonDB through a network load balancer that chooses a MySQL Server (or RDRS server) in the same domain. This MySQL Server will contact a RonDB data node in the same domain, and finally the RonDB data node will ensure that it reads data from the same domain. Thus we can completely avoid network messages that pass over domain boundaries for key lookup reads.

RONDB-784: Performance improvement for Complex Features in Go REST API Server#

  • Bump Go version to 1.22.9.

  • Use hamba avro as a replacement for the linkedin avro library to deserialize complex features.

  • Avoid json.Unmarshal when parsing complex feature fields.

  • Use the Sonic library to serialize JSON before sending it to the client.

This feature cuts the latency of complex feature processing to half of what it used to be, which significantly improves the latency of Feature Store REST API lookups.
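As a rough sketch of the new serialization path, the Go example below decodes an Avro-encoded complex feature with the hamba avro library and encodes the JSON response with Sonic. The schema and the feature value are made up for illustration and do not reflect the actual Hopsworks schemas.

// Hedged sketch of the new path: hamba/avro to decode a complex feature and
// sonic to serialize the JSON response. The schema and value are illustrative.
package main

import (
    "fmt"

    "github.com/bytedance/sonic"
    "github.com/hamba/avro/v2"
)

func main() {
    // Hypothetical Avro schema for an array-valued complex feature.
    schema := avro.MustParse(`{"type": "array", "items": "double"}`)

    // Encode a sample value so the example is self-contained; in the REST API
    // Server the Avro bytes would instead be read from RonDB.
    encoded, err := avro.Marshal(schema, []float64{0.1, 0.2, 0.3})
    if err != nil {
        panic(err)
    }

    // Decode the complex feature with hamba/avro (replacing the linkedin library).
    var feature []float64
    if err := avro.Unmarshal(schema, encoded, &feature); err != nil {
        panic(err)
    }

    // Serialize the response with sonic instead of encoding/json.
    out, err := sonic.Marshal(map[string]any{"feature": feature})
    if err != nil {
        panic(err)
    }
    fmt.Println(string(out))
}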

RONDB-776: Changed hopsworks.schema to use TEXT data type#

Impacts the Go REST API Server and its Feature Store REST API.

Bug Fixes#

Backport of Oracle Bug 35925503#

Fixes the issue where the message "Metadata: Failed to submit table 'mysql.ndb_apply_status' for synchronization" is submitted every minute.

Backport of a fix for a Mac OS X issue provided by Laurynas Biveinis to MySQL#

Fix for compilation with LLVM 17, backported from the FreeBSD fix of MySQL#

RONDB-686: Log pages can span 3 pages#

Ensure that ZMIN_READ_BUFFER_SIZE reflects this fact; otherwise a data node must be restarted with an initial node restart.

RONDB-687: Increase timeout before crash in overloaded environment#

In situations where VMs share CPUs with other VMs, access to a CPU can be both far between and very short lived once we get it. Since this can persist for a while, we increase the timeout used by the scan for timeouts to be on the order of 10 seconds instead of 1 second. The 1 second timeout has been seen to cause issues, so this change should resolve them.

RONDB-674: Fix ndbmtd crash on create index on table with old hash#

Scenario:

  • A table is created using an older version of RonDB. The old hash function is used since the new hash function is not supported.

  • RonDB is upgraded to a new version that supports the new hash function.

  • The old table still uses the old hash function.

  • An index is created on the old table.

Behavior before this fix (buggy):

  • NdbApi attempts to create the index with the new hash function, since it is supported.

  • ndbmtd requires that the hash flags of the index and the primary table match, and crashes when they don't.

Behavior after this fix (expected):

  • NdbApi attempts to create the index with a hash flag that matches that of the primary table.

  • When an index creation request comes in, ndbmtd compares the hash function flags of the index and the primary table, and if necessary corrects the hash on the index creation request.

RONDB-669: Fix downgrade issue using new hash function in 21.04.17#

RONDB-659: Handle schema changes in hopsworks database in RDRS#