Managing Management Servers#
The management server has the important task of storing the cluster configuration and passing it on to other nodes. These nodes always request it at startup via a network call - this is why rolling restarts are required for configuration changes.
Management servers themselves use the following 3 ways to load the configuration:
- Configuration database (using the `--configdir` option)
- Configuration file (using the `-f` / `--config-file` option)
- Other running management servers (fetched over the network)
For any configuration change, there must be unanimous consensus amongst all management servers with `NodeActive=1`. This means that if one management server is down, no configuration changes can be made. Any starting MGMd will also not be available until it knows that it is part of the consensus. If one MGMd dies, another MGMd can, however, continue forwarding configurations, since it knows that it contains a valid configuration and no change can be agreed without it.
Any new agreement made via the consensus will be persisted in the configuration database. This is a binary file that is shared amongst all active MGMds. It has the format `ndb_<node id>_config.bin.<sequence number>`, whereby the sequence number is bumped for every configuration change.
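To make the sequence-number convention concrete, the following sketch creates a few dummy files following this naming scheme for node 65 in a temporary directory (not a real configdir) and picks the latest one:

```shell
# Simulate a configdir with a few cached configurations for node 65
dir=$(mktemp -d)
touch "$dir"/ndb_65_config.bin.1 "$dir"/ndb_65_config.bin.2 "$dir"/ndb_65_config.bin.10

# The '.'-separated third field is the sequence number; sort it numerically
# so that the latest configuration comes last
ls "$dir" | sort -t. -k3 -n | tail -1   # prints: ndb_65_config.bin.10
```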
Any start of an MGMd requires at least the argument telling it where to read and write the configuration database (`--configdir`).
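Such a start command could look roughly as follows (the paths and node ID are hypothetical examples; `--configdir` points `ndb_mgmd` at the configuration database directory):

```shell
# Start the management server; it reads/writes its configuration
# database under --configdir (hypothetical paths and node ID)
ndb_mgmd --ndb-nodeid=65 \
  -f /var/lib/rondb/config.ini \
  --configdir=/var/lib/rondb/mgmd
```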
If a configuration database exists and we are not using the `--reload` or `--initial` flag, the config.ini file will be ignored. When using the `--initial` flag, the sequence number will be reset to 0 and previous configuration databases will be removed.
Changes to the configuration are done either by restarting an MGMd with the `--reload` flag or by using the management client (`ndb_mgm`). Using the management client is particularly convenient, since it does not require a rolling cluster restart afterwards. As mentioned earlier, the management client supports:
- Activating and deactivating node slots
- Changing nodes’ hostnames (when inactive)
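As a sketch, the corresponding management client commands could look as follows (node ID 65 and the hostname are hypothetical examples):

```shell
ndb_mgm -e "65 deactivate"       # free the slot so it can be changed
ndb_mgm -e "65 hostname host-2"  # point the inactive slot at a new host
ndb_mgm -e "65 activate"         # bring the slot back into the cluster
```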
The `--reload` flag is, on the other hand, needed if one wishes to:
Add, remove or change node slots (beyond activating / deactivating)
Add node groups
Change fields of the
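As a sketch (hostnames and paths are hypothetical), a `--reload` based change involves persisting the new config.ini everywhere, restarting each MGMd, and then rolling-restarting the rest of the cluster:

```shell
# 1. Edit config.ini, then replicate it to every MGMd host
#    (hosts and paths are hypothetical)
for host in host-1 host-2; do
  scp config.ini "$host":/var/lib/rondb/config.ini
done

# 2. On each MGMd host: restart the MGMd so it re-reads config.ini
#    and proposes the change to the consensus
ndb_mgmd --reload -f /var/lib/rondb/config.ini \
  --configdir=/var/lib/rondb/mgmd

# 3. Perform a rolling restart of the remaining cluster nodes so they
#    fetch the new configuration at startup
```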
The following timeline shows the lifecycle of using multiple MGMds, whereby we run the following actions:

- Move MGMds between hosts
- Restart MGMds upon failure
- Change the cluster configuration via the management client
- Change the cluster configuration via the config.ini file and the `--reload` flag

1. Manual: Create `config_v0.ini` and replicate across hosts
2. 🔄 Consensus v0 🔄
3. ⬆️⬆️⬆️ Rest of cluster starts ⬆️⬆️⬆️
4. ❌ MGMd crashes unexpectedly
5. `ndb_mgm -e "65 deactivate"`
6. 🔄 Consensus v1 🔄
7. ⬇️ MGMd goes down
8. Manual: Persist change to `config_v1.ini` and replicate across hosts
9. 🔄 Consensus v3 🔄
10. 🔄 Consensus v4 🔄 (Note: edge case of consensus without agreement)
11. Manual: Persist change to `config_v4.ini` and replicate across hosts
12. 🔄 Consensus v4 🔄
13. Manual: Change `TotalMemoryConfig` in `config_v5.ini` and replicate across hosts
14. 🔄 Consensus v5 🔄
15. 🔄🔄🔄 Rolling cluster restart 🔄🔄🔄
One thing that may become apparent is that persisting and replicating changes to the config.ini file is very important. If any MGMd is ever restarted with the `--reload` parameter, it will use the config.ini file. If the changes have not been persisted, a reload may use an entirely outdated config.ini file, which could break the cluster.
Handling MGMd Machine Failures#
One may have noted that an irretrievable MGMd machine is problematic, since it blocks our ability to change the configuration. The only way out of this situation is to escape the consensus and start a new cluster configuration sequence. This is done via the `--initial` flag.
Continuing from the timeline above, this situation can be handled as follows:

1. ❌ Host crashes unexpectedly
2. Manual: Change `Hostname=HOST_1` for node 65 in `config_v6.ini` and replicate across hosts
3. Restart the surviving MGMd with `--initial` using `config_v6.ini`
4. 🔄 Consensus v0 🔄
5. 🔄🔄🔄 Rolling cluster restart 🔄🔄🔄
This shows how we use an old configuration file v6 to start a new cluster sequence v0.
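Assuming hypothetical paths, escaping the consensus amounts to an `--initial` start from the persisted configuration file:

```shell
# Wipe the local configuration database and start a new configuration
# sequence (v0) from the persisted config file (paths are hypothetical)
ndb_mgmd --initial -f /var/lib/rondb/config_v6.ini \
  --configdir=/var/lib/rondb/mgmd
```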
An issue with an irretrievable host, however, is that one may not know whether it is down or whether there is a network partition. One does not want two partitions running with different configurations.
Fortunately, RonDB uses an arbitrator to handle partitions. If a partitioned cluster has a minority of data nodes, they will simply fail directly. If both partitions contain 50% of data nodes, the first partition that contacts the arbitrator will survive.
Therefore, when using `--initial`, one can first check whether any data nodes are running. If so, one is in the winning partition and can continue changing the configuration. If not, one leaves the partition idle. To check whether any data nodes are running, one can run `ndb_mgm -e "show"`.
Adding a MGMd Slot#
In contrast to activating an MGMd node slot, adding a new MGMd node slot will also require an `--initial` restart of the live MGMd. This conforms to the consensus idea - the live MGMd should not be available until it knows that the other MGMd has agreed to the new configuration.
If one does not want two active MGMds, we therefore still recommend adding one inactive MGMd slot. This avoids rolling cluster restarts when moving the MGMd to another host.
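Such an inactive slot could be sketched in config.ini as follows (node ID and hostname are hypothetical; `NodeActive=0` marks the slot as inactive, as discussed above):

```ini
[ndb_mgmd]
NodeId=66
Hostname=HOST_2
NodeActive=0
```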
Number of Running Management Servers#
When deciding the number of running management servers, one should take into account that:
Data nodes cannot start up without a running MGMd
Live MGMds require an initial restart if another MGMd host is down
An initial MGMd restart requires a rolling restart of the cluster
Setting up a cluster with multiple running management servers therefore has both pros and cons:
+ Data node process recovery is more stable
- The cluster is more likely to require rolling restarts once in a while