# Managing Management Servers
The management server has the important task of holding the cluster configuration and passing it on to other nodes. Other nodes always request the configuration at startup via a network call, which is why rolling restarts are required for configuration changes to take effect.
Management servers themselves load the configuration in one of the following three ways (illustrated after the list):

- Network (using `--ndb-connectstring`)
- Configuration database (using `--configdir`)
- Configuration file (using `--config-file`)
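As a rough illustration, these three variants correspond to the following kinds of `ndb_mgmd` invocations; the host name, port and paths below are placeholders, not taken from the text above:

```bash
# Sketch of the three ways an MGMd can obtain its configuration
# (host names and paths are placeholders).

# 1. Network: fetch the configuration from another running MGMd
ndb_mgmd --ndb-connectstring=mgmd_host_2:1186 --configdir=/var/lib/rondb/mgmd

# 2. Configuration database: read the latest binary ndb_<node id>_config.bin.<seq> file
ndb_mgmd --configdir=/var/lib/rondb/mgmd

# 3. Configuration file: parse config.ini (e.g. on a first start or together with --reload)
ndb_mgmd --config-file=config.ini --configdir=/var/lib/rondb/mgmd
```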
For any configuration change, there must be unanimous consensus amongst all management servers with `NodeActive=1`. This means that if one management server is down, no configuration changes can be made. Any starting MGMd will also not be available until it knows that it is part of the consensus. If one MGMd dies, another MGMd can however continue forwarding the configuration, since it knows that it contains a valid configuration and no change can be agreed without it.
## Configuration Database
Any new agreement made via the consensus will be persisted in the configuration database. This is a binary file that is shared amongst all active MGMds. It has the format `ndb_<node id>_config.bin.<sequence number>`, whereby the sequence number is bumped for every configuration change.
Any start of an MGMd requires at least an argument specifying where to read and write the configuration database.
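A minimal sketch of such a start, where the configdir path is a placeholder rather than taken from the text above:

```bash
# Minimal MGMd start; --configdir points at the directory where the binary
# configuration database is read and written (the path is a placeholder).
ndb_mgmd -f config.ini --configdir=/var/lib/rondb/mgmd
```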
If a configuration database exists and we are not using the `--initial` flag, the `config.ini` file will be ignored. When using the `--initial` flag, the sequence number is reset to 0 and previous configuration databases are removed.
## Configuration Changes
Changes to the configuration are made either by restarting an MGMd with the `--reload` flag or by using the management client `ndb_mgm`.
Using the management client is particularly convenient, since it does not require a rolling cluster restart afterwards. As mentioned earlier, the management client supports the following (example commands follow the list):

- activating nodes
- deactivating nodes
- changing nodes' hostnames (when inactive)
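For example, deactivating and re-activating node slot 65 (the node id matches the lifecycle table below; the activate syntax is assumed to mirror the deactivate command) could look as follows:

```bash
# Deactivate and later re-activate node slot 65 via the management client.
# Neither command requires a rolling cluster restart afterwards.
ndb_mgm -e "65 deactivate"
ndb_mgm -e "65 activate"
# Changing an inactive node's hostname is likewise done through ndb_mgm.
```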
Using the `--reload` flag is on the other hand needed if one wishes to (a restart sketch follows the list):

- Add, remove or change node slots (beyond activating / deactivating)
- Add node groups
- Change fields of the `[ndbd default]` or `[tcp]` header
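A sketch of such a `--reload` restart, assuming the updated `config.ini` has already been replicated to the host; the configdir path is a placeholder:

```bash
# Restart an MGMd so that it re-reads config.ini and proposes the change
# to the other MGMds (the configdir path is a placeholder).
ndb_mgmd -f config.ini --reload --configdir=/var/lib/rondb/mgmd

# A rolling restart of the remaining cluster nodes is required afterwards
# for the change to take effect everywhere.
```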
## MGMd Lifecycle
The following table shows the lifecycle of using multiple MGMds, whereby we run the following actions:

- Move MGMds between hosts
- Restart MGMds upon failure
- Change the cluster configuration via the management client
- Change the cluster configuration via the `--reload` flag
| HOST_1 | HOST_2 | HOST_3 |
|---|---|---|
| Manual: Create `config_v0.ini` and replicate across hosts | | |
| Consensus v0 | | |
| Rest of cluster starts | | |
| MGMd crashes unexpectedly | | |
| `ndb_mgm -e "65 deactivate"` | | |
| Consensus v1 | | |
| MGMd goes down | | |
| Manual: Persist change to `config_v1.ini` and replicate across hosts | | |
| Consensus v3 | | |
| Consensus v4 (Note: edge-case of consensus without agreement) | | |
| Manual: Persist change to `config_v4.ini` and replicate across hosts | | |
| Consensus v4 | | |
| Manual: Change `TotalMemoryConfig` in `config_v5.ini` and replicate across hosts | | |
| Consensus v5 | | |
| Rolling cluster restart | | |
One thing that may become apparent is that persisting and replicating changes to the `config.ini` file is very important. If any MGMd is ever restarted with the `--reload` parameter, it will use the `config.ini` file. If the changes have not been persisted, a reload may use an entirely outdated `config.ini` file, which could break the cluster.
## Handling MGMd Machine Failures
One may have noted that an irretrievable MGMd machine is problematic, since it blocks our ability to change the configuration. The only way out of this situation is to escape consensus and start a new cluster configuration sequence. This is done via the `--initial` flag.
Continuing from the previous table, this situation can be handled as follows:
| HOST_1 | HOST_2 | HOST_3 |
|---|---|---|
| Host crashes unexpectedly | | |
| Manual: Change `Hostname=HOST_1` for node 65 in `config_v6.ini` and replicate across hosts | | |
| Consensus v0 | | |
| Rolling cluster restart | | |
This shows how we use an old configuration file v6 to start a new cluster sequence v0.
An issue with an irretrievable host is however that one may not know whether it is down or whether one has a network partition. One does not want two partitions running with different configurations.
Fortunately, RonDB uses an arbitrator to handle partitions. If a partitioned cluster has a minority of data nodes, they will simply fail directly. If both partitions contain 50% of data nodes, the first partition that contacts the arbitrator will survive.
Therefore, when using `--initial`, one can first check whether any data nodes are running. If so, one is in the winning partition and can continue changing the configuration. If not, one leaves the partition idle. To check whether any data nodes are running, one can run the command `ndb_mgm -e "show"`.
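A sketch of this procedure, using the `config_v6.ini` file from the table above; the configdir path is a placeholder:

```bash
# 1. Check whether any data nodes are running. If the output shows running
#    data nodes, we are in the winning partition and may proceed.
ndb_mgm -e "show"

# 2. Escape the old consensus and start a new configuration sequence from
#    the updated configuration file (the configdir path is a placeholder).
ndb_mgmd -f config_v6.ini --initial --configdir=/var/lib/rondb/mgmd
```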
## Adding an MGMd Slot
In contrast to activating an MGMd node slot, adding a new MGMd node slot will also require an `--initial` restart of the live MGMd. This conforms more to the consensus idea: the live MGMd should not be available until it knows that the other MGMd has agreed to the configuration.
Even if one does not want two active MGMds, we therefore still recommend adding one inactive MGMd slot. This will avoid rolling cluster restarts when moving the MGMd to another host.
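As a sketch, such an extra inactive slot could be declared in `config.ini` as follows; the node id, hostname and configdir path are assumptions, only the `NodeActive` parameter is taken from the text above:

```bash
# Append an extra, inactive MGMd slot to config.ini (values are placeholders).
cat >> config.ini <<'EOF'
[ndb_mgmd]
NodeId=68
Hostname=HOST_3
NodeActive=0
EOF

# Adding the new slot still requires an --initial restart of the live MGMd,
# followed by a rolling restart of the cluster.
ndb_mgmd -f config.ini --initial --configdir=/var/lib/rondb/mgmd
```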
## Number of Running Management Servers
When deciding on the number of running management servers, one should take into account that:

- Data nodes cannot start up without a running MGMd
- Live MGMds require an initial restart if another MGMd host is down
- An initial MGMd restart requires a rolling restart of the cluster

Setting up a cluster with multiple running management servers therefore has both pros and cons:

- (+) Data node process recovery is more stable
- (-) The cluster is more likely to require rolling restarts once in a while