Release Notes RonDB 21.04.8#
RonDB 21.04.8 is the eighth release of RonDB 21.04.
RonDB is based on MySQL NDB Cluster 8.0.23. It is a bug fix release based on RonDB 21.04.7. RonDB 21.04.7 was only released internally, so these release notes cover both changes in RonDB 21.04.7 and RonDB 21.04.8.
RonDB 21.04.8 is released as open source software with binary tarballs for use on Linux. It is developed on Linux and Mac OS X, and on Windows using WSL 2 (Linux on Windows).
RonDB 21.04.8 can be used with both x86_64 and ARM64 architectures although ARM64 is still in beta state.
RonDB 21.04.8 is tested and verified on both x86_64 and ARM platforms, on both Linux and Mac OS X. It is, however, only released as a binary tarball for x86_64 on Linux.
Description of RonDB#
RonDB is designed to be used in a managed environment where the user only needs to specify the type of the virtual machine used by the various node types. RonDB has the features required to build a fully automated managed RonDB solution.
It is designed for applications requiring the combination of low latency, high availability, high throughput and scalable storage.
You can use RonDB in a Serverless version on app.hopsworks.ai. In this case Hopsworks manages the RonDB cluster and you can use it for your machine learning applications. You can use this version for free, with quotas on the number of Feature Groups (tables) you are allowed to add and on memory usage. You can get started in a minute: there is no need to set up a database cluster or worry about its configuration, it is all taken care of.
You can use the managed version of RonDB available on hopsworks.ai. This sets up a RonDB cluster in your own AWS, Azure or GCP account using the Hopsworks managed software, given a few details about the HW resources to use. These details can be provided either through a web-based UI or using Terraform. The RonDB cluster is integrated with Hopsworks and can be used both for RonDB applications and for Hopsworks applications.
You can use the cloud scripts that make it easy to set up a cluster on AWS, Azure or GCP. This requires no previous knowledge of RonDB; the script only needs a description of the HW resources to use, and the rest of the setup is automated.
You can use the open source version and use the binary tarball and set it up yourself.
You can use the open source version and build and set it up yourself. These are the commands you can use:
```bash
# Download x86_64 on Linux
wget https://repo.hops.works/master/rondb-21.04.8-linux-glibc2.17-x86_64.tar.gz
```
RonDB 21.04 is a Long Term Support version that will be maintained until at least 2024.
Maintaining 21.04 mainly means fixing critical bugs and handling minor change requests. It doesn't involve merging with any future release of MySQL NDB Cluster; that will be handled in newer RonDB releases.
Backports of critical bug fixes from MySQL NDB Cluster will happen.
Summary of changes in RonDB 21.04.8#
RonDB 21.04.8 contains three bug fixes since RonDB 21.04.6 and two new features. In total RonDB 21.04 contains 17 new features on top of MySQL Cluster 8.0.23 and a total of 105 bug fixes.
RonDB uses four different ways of testing. MTR is a functional test framework built using SQL statements to test RonDB. The Autotest framework is specifically designed to test RonDB using the NDB API. The Autotest is mainly focused on testing high availability features and performs thousands of restarts using error injection as part of a full test suite run. Benchmark testing ensures that we maintain the throughput and latency that is unique to RonDB. Finally we also test RonDB in the Hopsworks environment where we perform both normal actions as well as many actions to manage the RonDB clusters.
RonDB has a number of unit tests that are executed as part of the build process to ensure the quality of RonDB.
RonDB has a functional test suite using MTR (MySQL Test Run) that executes more than 500 RonDB-specific test programs. In addition there are thousands of test cases for the MySQL functionality. MTR is executed on both Mac OS X and Linux.
We also have a special mode of MTR testing where we can run with different versions of RonDB in the same cluster to verify our support of online software upgrade.
RonDB is very focused on high availability. This is tested using a test infrastructure we call Autotest. It contains many hundreds of test variants; executing the full set takes around 36 hours. One test run with Autotest uses a specific configuration of RonDB. We execute multiple such configurations, varying the number of data nodes, the replication factor, and the thread and memory setup.
An important part of this testing framework is that it uses error injection. This means that we can test exactly what will happen if we crash in very specific situations, if we run out of memory at specific points in the code and various ways of changing the timing by inserting small sleeps in critical paths of the code.
During one full test run of Autotest RonDB nodes are restarted thousands of times in all sorts of critical situations.
Autotest currently runs on Linux with a large variety of CPUs, Linux distributions and even on Windows using WSL 2 with Ubuntu.
We test RonDB using the Sysbench test suite, DBT2 (an open source variant of TPC-C), flexAsynch (an internal key-value benchmark), DBT3 (an open source variant of TPC-H) and finally YCSB (Yahoo Cloud Serving Benchmark).
The focus is on testing RonDB's LATS capabilities (low Latency, high Availability, high Throughput and scalable Storage).
Finally we also execute tests in Hopsworks to ensure that it works with HopsFS, the distributed file system built on top of RonDB, and HSFS, the Feature Store designed on top of RonDB, and together with all other use cases of RonDB in the Hopsworks framework.
Two new ndbinfo tables to check memory usage#
Two new ndbinfo tables are created, ndb$table_map and ndb$table_memory_usage. ndb$table_memory_usage lists four properties for all table replicas:

- in_memory_bytes: the number of bytes used by a table fragment replica in DataMemory
- free_in_memory_bytes: the number of those bytes that are free; these bytes are always in the variable-sized part
- disk_memory_bytes: the number of bytes used by the disk columns, essentially the number of extents allocated to the table fragment replica times the extent size of the tablespace
- free_disk_memory_bytes: the number of bytes free in the disk memory for disk columns
Since each table fragment replica provides one row, we use a GROUP BY on table id and fragment id and take the MAX of those columns to ensure we only get one row per table fragment.
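The aggregation described above can be sketched as a query. This is a sketch only: the column names are taken from the description above, and the actual view definition shipped with RonDB may differ.

```sql
-- Sketch: collapse the per-replica rows of ndb$table_memory_usage into one
-- row per table fragment. Since replicas of the same fragment report
-- roughly the same sizes, MAX picks one representative value per fragment.
SELECT table_id,
       fragment_id,
       MAX(in_memory_bytes)        AS in_memory_bytes,
       MAX(free_in_memory_bytes)   AS free_in_memory_bytes,
       MAX(disk_memory_bytes)      AS disk_memory_bytes,
       MAX(free_disk_memory_bytes) AS free_disk_memory_bytes
FROM ndbinfo.ndb$table_memory_usage
GROUP BY table_id, fragment_id;
```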
We want to provide the memory usage in memory and in disk memory per table or per database. However, a table in RonDB is spread out over several internal tables, and there are four places a table can use memory. First, the table itself uses memory for rows and for a hash index; when disk columns are used, this table also makes use of disk memory. Second, there are ordered indexes that use memory for the index information. Third, there are unique indexes that use memory for the rows in the unique index (a unique index is simply a table with the unique key as primary key and the primary key as columns) and for the hash index of the unique index table. This table is not necessarily colocated with the table itself. Finally, there are also BLOB tables that can contain a hash index, row storage and even disk memory usage.
The user isn't particularly interested in this level of detail, so we want to display memory usage for the tables and databases that the user sees. The tool to gather the data for this is the new ndbinfo table ndb$table_map, which lists the table name and database name given a table id. The table id can belong to a table, an ordered index, a unique index or a BLOB table, but the result is always the name of the actual table defined by the user, not the name of the index table or BLOB table.
Using those two tables we create two ndbinfo views. The first, table_memory_usage, lists the database name, the table name and the above four properties for each table in the cluster. The second, database_memory_usage, lists the database name and the four properties summed over all table fragments of all tables, including the tables RonDB creates internally for BLOBs and indexes.
To make things a bit more efficient we keep track of all ordered indexes attached to a table internally in RonDB. Thus ndb$table_memory_usage lists the memory usage of a table plus the ordered indexes on that table; there are no separate rows presenting the memory usage of an ordered index.
These two views make it easy for users to see how much memory they are using in a certain table or database. This is useful when managing a RonDB cluster.
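A minimal usage sketch of the two views follows. The database name `mydb` is a placeholder, and the exact set of output columns may differ from what is shown here.

```sql
-- Per-table memory usage for one database (mydb is a hypothetical name)
SELECT database_name, table_name,
       in_memory_bytes, free_in_memory_bytes,
       disk_memory_bytes, free_disk_memory_bytes
FROM ndbinfo.table_memory_usage
WHERE database_name = 'mydb';

-- Total memory usage summed per database
SELECT * FROM ndbinfo.database_memory_usage;
```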
Make it possible to use IPv4 sockets between ndbmtd and API nodes#
In MySQL NDB Cluster all sockets were converted to use the IPv6 format even when IPv4 is used. As a result, MySQL NDB Cluster could no longer interact with device drivers that only work with IPv4 sockets. This is the case for Dolphin SuperSockets.
Dolphin SuperSockets make it possible to use extremely low-latency HW to connect the nodes in a cluster, improving latency significantly. RonDB has been tested and benchmarked using Dolphin SuperSockets.
RONDB-126: Fix neighbour handling#
There were some additional issues with neighbour node handling in transporters. We therefore disabled the neighbour node concept; communication is now handled equally for all nodes in the RonDB cluster.
As part of this work we added data validation code for each update in debug mode and error injection mode, which are used in the internal testing of RonDB.
Upgrade to using GCC 9 in release builds#
Release builds are now performed with GCC 9 in a Docker container using Oracle Linux 7. To enable anyone to build their own binary tarball based on the source code, we have also published our release build scripts. Most of these scripts are driven by a Dockerfile found in the build_scripts/release_scripts directory of the RonDB distribution.
RONDB-148: Increase stack size#
As an extra precaution against stack overflow issues, we decided to increase the stack sizes of the internal threads in the data nodes.
RONDB-134: Failed to activate node immediately after node failure#
An activate request could fail if it was handled too close to a node failure. To avoid this, and to simplify the management code, we ensured that an activate request can wait up to 2 minutes before a failure of the request is reported. We also added more log printouts during these commands.
RONDB-131: Fixes of LQH_TRANSREQ protocol#
This fixes a serious bug, introduced in RonDB 21.04.2, that could lead to hanging node failure handling when there are more than 3 data nodes in the RonDB cluster.