
Stream to backup cluster#

This tutorial shows how to stream data from a primary RonDB cluster to a backup RonDB cluster using a single replication channel. Note that this tutorial builds on the previous one: it assumes that the primary cluster is already running two MySQL replication servers (server_id=100 and server_id=101) and is therefore ready for replication. The steps to do so are covered in the previous tutorial.

Here, we will be running one MySQL replication server with server_id=200 and therefore only one replication channel. The following table shows the setup:

Primary cluster                                                     Backup cluster

server_id=100 (Primary binlog server)   Replication channel ➡️     server_id=200 (Primary replica applier)
server_id=101 (Backup binlog server)
The backup binlog server will however stay idle in this tutorial.

Supported cluster types#

With this tutorial, the following types of backup clusters can be created:

  • (One or more) Live-backup clusters

    Clusters that are only used to keep a live backup of the primary cluster. If the primary cluster fails and its data cannot be recovered, these clusters can be used to create a backup and restore the primary cluster.

  • (One or more) Read-only clusters

    Clusters that can be used for read-only operations. These can be used to offload read operations from the primary cluster. Technically they can also be used for writes, but these will not be replicated back to the primary cluster.

Note, however, that none of these backup clusters will be highly available, since we’re only using one replication channel. Any failure of the replication channel will require us to take a new backup of the primary cluster and restore the backup cluster from it.

This tutorial is also not suitable for creating the following types of backup clusters:

  • Fail-over clusters

    Clusters that are used to take over as primary in case of a failure

  • Read-write clusters

    Clusters that can be used for writes and therefore require conflict-detection mechanisms

These types of backup clusters (and a highly available one) will be handled in subsequent tutorials.

Setting up backup cluster#

In the upcoming tutorials, we will have an active-active cluster setup, where the primary cluster also replicates from the backup cluster using conflict detection. There, we will add MySQL replication servers to the primary cluster that act as replica appliers. This setup would look as follows:

Active cluster        Active cluster

server_id=100   ➡️    server_id=200
server_id=101   ➡️    server_id=201
server_id=102   ⬅️    server_id=202
server_id=103   ⬅️    server_id=203

Here, server_id=102 and server_id=103 are the replica appliers in the current primary cluster. This makes it easy to separate the roles and required configuration parameters of every MySQL replication server. Technically, one could however also set up a single MySQL replication server as both binlog server and replica applier.

If one is planning to use the backup cluster for fail-over scenarios or in an active-active setup in the future, one can create it just like the primary cluster in the previous tutorial. The only difference would be to use server_id=202 and server_id=203 instead of server_id=100 and server_id=101. If one does so, the cluster will be running with two binlog servers. In this tutorial, these will be idle.

When setting up the backup cluster, one has a few possible scenarios:

  1. The primary cluster has created a backup

    Our cluster should be restored with this backup.

  2. The primary cluster was restored with a backup (and has not created one)

    Our cluster will need to be restored with the same backup.

  3. The primary cluster has not restored or created a backup

    This assumes that all data is available as binlogs in the primary cluster. Our cluster can be started from scratch.

Whenever there is a backup involved, one can follow the steps here. Most importantly:

  • Using ndb_restore --no-binlog ensures that the restore is not unnecessarily logged to any binlogs. This is particularly important if we want our backup cluster to also replicate back to the primary cluster in the future.

  • Using ndb_restore --restore_epoch ensures that the ndb_apply_status table contains a row with server_id=0 and the epoch number from which the backup was taken. This is important if the backup was created by the primary cluster and we need to find the binlog file and position to start replicating from. This is explained further on.
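
As a hedged illustration, a restore invocation combining these options could look roughly as follows. The node ID, backup ID, connect string and backup path are placeholders, and the full restore procedure (metadata, all data nodes, index rebuilds) follows the restore steps referenced above:

# Restore the data for one data node of the backup cluster, without writing the
# restored rows to any binlog, and record the backup's epoch in mysql.ndb_apply_status:
ndb_restore --ndb-connectstring=MGMD_HOST:1186 \
    --nodeid=1 --backupid=1 \
    --restore-data --no-binlog --restore_epoch \
    --backup-path=/path/to/BACKUP/BACKUP-1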

If the backup cluster restores a backup, it is important to quickly start replicating from the binlog servers. If one waits too long, the binlog files in the primary cluster may exhaust its disk space, forcing it to purge them and create a new backup. In this case, we have to restart our backup cluster and restore the new backup.

In any scenario, the replica applier(s) can be started last in the backup cluster. As listed previously, the configuration parameters to start the replica applier are as follows:

mysqld \
    --server-id=200 \
    --ndbcluster \
    --skip-replica-start \
    --replica-allow-batching

Creating a MySQL user for replication#

Since the replica applier needs to connect to the source MySQL Server, it is recommended for security reasons to set up a dedicated replication user on the source MySQL Server.

CREATE USER 'repl_user'@'HOSTNAME' IDENTIFIED WITH mysql_native_password BY 'repl_password';

-- Replicate user to data nodes:
GRANT NDB_STORED_USER ON *.* TO 'repl_user'@'HOSTNAME';

GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'HOSTNAME';

-- Optional:
GRANT REPLICATION CLIENT ON *.* TO 'repl_user'@'HOSTNAME';
GRANT REPLICATION_SLAVE_ADMIN ON *.* TO 'repl_user'@'HOSTNAME';

The HOSTNAME should be the hostname or IP address of the replica applier. Technically, one can create one MySQL user per binlog server. However, this may complicate the setup.
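
To verify the result, one can list the grants for the new user on the source MySQL Server:

SHOW GRANTS FOR 'repl_user'@'HOSTNAME';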

Finding the binlog position to start replication#

Meanwhile, the binlog server in the primary cluster has been writing transactions to its binlog files for every epoch. The replica applier must now know from which binlog file, and from which position within it, it should start replicating. We will call these variables BINLOG_FILE and BINLOG_POSITION. When a restored backup is involved, knowing the epoch number at which the backup was taken helps.

Epochs in RonDB are independent between clusters. Restoring a backup in a cluster will not affect its epoch number. However, a backup in RonDB will still contain the epoch number from the cluster it originated from. When restoring the backup using ndb_restore --restore_epoch, a row is inserted into the ndb_apply_status table with server_id=0 and this epoch number.

The naming of the binlog files is defined by the log-bin parameter in the MySQL configuration file of the binlog server. The first binlog file will be <log-bin>.000001. The minimum position in a binlog file is 4.

When searching for our binlog file and position, we again have the scenarios mentioned earlier:

  • The primary cluster has not restored or created a backup

    Since we can assume here that all data is available as binlogs in the primary cluster, our variables are BINLOG_FILE=<log-bin>.000001 and BINLOG_POSITION=4.

  • The primary cluster has created a backup

    This means that we will have restored the backup cluster from the backup created by the primary cluster. We can therefore retrieve the epoch number of the backup by running this on the replica applier:

    SELECT MAX(epoch) AS restore_epoch FROM mysql.ndb_apply_status WHERE server_id = 0;

    We store the result in the shell variable $RESTORE_EPOCH (a shell sketch of this follows after this list).

    Epochs have the format <32-bit GCI>/<32-bit micro-GCI>. The epoch number restored by the backup is a pseudo-epoch whereby the micro-GCI is set to 0xffffffff. Therefore, this pseudo-epoch will be higher than any real epoch with the same GCI and lower than any real epoch with a higher GCI.

    Since this epoch number originated from the primary cluster, we can use it to query the binlog file and position from the ndb_binlog_index table in the binlog server. However, due to the pseudo-epoch, we cannot use the equal operator on the epoch in the query. Furthermore, it may not be entirely clear what binlog files the primary cluster has available. The entire issue at hand is described by Frazer Clement in MySQL bug report 110785. Therefore, we use the following, more complex query:

    SET @file = NULL;
    SET @position = NULL;
    
    -- [Assumption] We have post-backup binlog files.
    --              Data has been written to the primary cluster since the backup.
    -- [Action]     Go to the first epoch after the pseudo epoch and start from the
    --              position of this epoch:
    SELECT SUBSTRING_INDEX(File, '/', -1) AS file, 
        position AS position
    INTO @file, @position
    FROM mysql.ndb_binlog_index
    WHERE epoch > $RESTORE_EPOCH
    ORDER BY epoch ASC
    LIMIT 1;
    
    -- [Conclusion] If @file is still NULL here, no data has been written to the
    --              primary cluster since the backup.
    -- [Assumption] We have some pre-backup binlog files.
    --              The binlog files have not been purged since the backup.
    -- [Action]     Find the epoch just before the pseudo-epoch and take the binlog
    --              file and position of the *next* epoch. (A bare IF ... END IF block
    --              is only valid inside a stored program, so the fallback is guarded
    --              by the "@file IS NULL" condition instead.)
    SELECT SUBSTRING_INDEX(next_file, '/', -1) AS file,
        next_position AS position
    INTO @file, @position
    FROM mysql.ndb_binlog_index
    WHERE @file IS NULL AND epoch < $RESTORE_EPOCH
    ORDER BY epoch DESC
    LIMIT 1;

    -- A base assumption here is that the backup was taken by the primary cluster itself.
    -- Therefore, the cluster should have created binlog files. The command to purge binlog
    -- files will only purge them *until* the current binlog file. Therefore, we can assume
    -- that there is *at least one* binlog file after the backup.
    
    SELECT @file AS file, @position AS position;
    

    Note that at this point, we will have two binlog servers running in the primary cluster. Their binlog files and index positions are not synchronized, so this query must be issued to the binlog server one wants to replicate from.

  • The primary cluster was restored with a backup (and has not created one)

    As mentioned earlier, this means that we will have restored the backup cluster from the same backup as the primary cluster. However, this is a tricky case, since the epoch number contained in the backup does not reference an epoch in the primary cluster. The epoch number generated by ndb_restore --restore_epoch is therefore of no use here.

    Instead, we can assume that the primary cluster started its binlog from scratch following the restore. Therefore, our variables are once again BINLOG_FILE=<log-bin>.000001 and BINLOG_POSITION=4.
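
As a minimal shell sketch of the case where the primary cluster has created the backup, and assuming hypothetical hostnames backup-mysqld (the replica applier) and primary-binlog-server (the binlog server we want to replicate from), the lookup could be scripted as follows. It only covers the common first branch of the query above (data has been written since the backup):

# Capture the backup's epoch from the replica applier in the backup cluster ...
RESTORE_EPOCH=$(mysql -h backup-mysqld -N -B -e \
    "SELECT MAX(epoch) FROM mysql.ndb_apply_status WHERE server_id = 0;")

# ... and look up the first binlog file and position after that epoch on the
# binlog server of the primary cluster. If this returns no row, fall back to
# the pre-backup branch of the query above:
mysql -h primary-binlog-server -e "
    SELECT SUBSTRING_INDEX(File, '/', -1) AS file, position AS position
    FROM mysql.ndb_binlog_index
    WHERE epoch > $RESTORE_EPOCH
    ORDER BY epoch ASC
    LIMIT 1;"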

Setting up replication channel#

A good practice before setting up the replication channel is to reset the replication channel settings on the replica applier. We do this to ensure that no previous replica settings are interfering:

RESET REPLICA;

Now, we can configure the replication channel on the replica applier. The following command must be executed on it:

CHANGE REPLICATION SOURCE TO
    -- Hostname of source MySQL Server
    SOURCE_HOST='source_host',
    -- Port number of the source MySQL Server
    SOURCE_PORT=3306,
    -- Username that the replica will use on the source MySQL Server
    SOURCE_USER='repl_user',
    -- Password of this user on the source MySQL Server
    SOURCE_PASSWORD='repl_password',
    -- The starting binlog file on the source MySQL Server
    SOURCE_LOG_FILE='$BINLOG_FILE',
    -- The starting position on the source MySQL Server
    SOURCE_LOG_POS=$BINLOG_POSITION,
    -- Ignore replication from server IDs in backup cluster to avoid circular replication
    IGNORE_SERVER_IDS=(202,203);

This command does not start replication yet, so it is safe to run it on all MySQL replication servers of the backup cluster. Note, however, that one may want to have each MySQL replication server in the backup cluster connect to a different source MySQL Server. This is covered in the next tutorial, where we stream with high availability.

There is also a set of options in the CHANGE REPLICATION SOURCE TO command to secure the communication between the replica and the source using SSL/TLS.
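
For example, the channel can be encrypted as follows (a sketch; the certificate paths are placeholders):

CHANGE REPLICATION SOURCE TO
    SOURCE_SSL=1,
    SOURCE_SSL_CA='/path/to/ca.pem',
    SOURCE_SSL_CERT='/path/to/replica-cert.pem',
    SOURCE_SSL_KEY='/path/to/replica-key.pem';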

Avoiding circular replication (IGNORE_SERVER_IDS)#

The IGNORE_SERVER_IDS parameter is optional but prepares the setup for the upcoming tutorials, where both clusters will be replicating from each other (active-active). There, the primary cluster will also apply records that originated in the backup cluster. The backup cluster will then have two binlog servers with server IDs 202 and 203. Using this parameter prevents the replica appliers in the backup cluster from pulling these changes back in. Circular replication is therefore stopped.

The alternative is to set log-replica-updates=OFF on all binlog servers (in both the primary and the backup cluster). This stops the binlog servers from logging writes that come from their local replica applier in the first place, which also prevents circular replication. Using the default (log-replica-updates=ON) can be useful if a third cluster wants to replicate from our backup cluster. That cluster then does not need to merge binlogs from our primary and backup clusters; all binlogs will be in one place.
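
As a sketch of this alternative, the flag is simply added when starting each binlog server (only the flags relevant here are shown; the remaining binlog server options from the previous tutorial stay unchanged):

mysqld \
    --server-id=100 \
    --ndbcluster \
    --log-replica-updates=OFF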

Start replica applier#

Now we are prepared to start our replication channel. We execute this command on the replica applier:

START REPLICA;

The replication channel will start and any transactions executed in the primary cluster will be applied in the backup cluster through the replica applier.

After starting the replica applier, it can take some time before the backup cluster has caught up with the primary cluster. The backup cluster will never entirely catch up since the replication is asynchronous; however, the replication lag can be reduced to seconds. If the backup cluster was first restored from a backup, the replication lag will initially cover all the data that was written to the primary cluster in the meantime. For larger clusters, it can take hours to reduce this replication lag to seconds.
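
The progress of the channel can be followed on the replica applier, for example with:

SHOW REPLICA STATUS\G

Among other fields, Replica_IO_Running and Replica_SQL_Running show whether the channel is running, and Seconds_Behind_Source gives an estimate of the current replication lag.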

Restarting replica applier#

In contrast to the initial start of the replica applier, we will now most likely already have replicated data in the backup cluster. This means that one can no longer use BINLOG_FILE=<log-bin>.000001 and BINLOG_POSITION=4, or the epoch number from the backup, to find the binlog file and position to start from. Instead, we check for the latest applied epoch in the backup cluster using the following query:

SELECT @latest:= MAX(epoch) FROM mysql.ndb_apply_status WHERE server_id != 0;

We specifically set server_id != 0 since both the primary and the backup cluster may have been restored from the same backup. Since this backup would originate from an entirely different cluster, its epoch number could be higher than the primary’s current epoch number.

Since we have now replicated data, the latest applied epoch will also not be a pseudo-epoch any longer. Therefore, we could now use the equals operator in the query to find the binlog file and position to start from. However, if the MySQL parameter --ndb-log-empty-epochs was used, some empty epochs may not be listed in the ndb_binlog_index table. To safeguard against this, we use <= and select the highest epoch:

SELECT SUBSTRING_INDEX(next_file, '/', -1), next_position
    FROM mysql.ndb_binlog_index
    WHERE epoch <= @latest
    ORDER BY epoch DESC LIMIT 1;

An example of this might be the following:

mysql> SELECT epoch,
            SUBSTRING_INDEX(next_file, '/', -1) AS file,
            next_position AS position
        FROM mysql.ndb_binlog_index
        WHERE epoch <= 3182570766345
        ORDER BY epoch DESC LIMIT 4;
+---------------+---------------+----------+
| epoch         | file          | position |
+---------------+---------------+----------+
| 3182570766345 | binlog.000005 |   107493 |
| 3182570766344 | binlog.000005 |   107007 |
| 3173980831744 | binlog.000005 |   106521 |
| 3169685864457 | binlog.000005 |   106035 |
+---------------+---------------+----------+
4 rows in set (0.01 sec)
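
Note that mysql.ndb_apply_status is queried in the backup cluster, whereas mysql.ndb_binlog_index is a local table on the binlog server of the primary cluster. In practice, the value therefore has to be carried between the two servers, for instance via a shell variable (hostnames are again hypothetical):

# Latest epoch applied in the backup cluster (excluding the restored backup row):
LATEST_EPOCH=$(mysql -h backup-mysqld -N -B -e \
    "SELECT MAX(epoch) FROM mysql.ndb_apply_status WHERE server_id != 0;")

# Binlog file and position to resume from, looked up on the binlog server:
read -r BINLOG_FILE BINLOG_POSITION <<< "$(mysql -h primary-binlog-server -N -B -e "
    SELECT SUBSTRING_INDEX(next_file, '/', -1), next_position
    FROM mysql.ndb_binlog_index
    WHERE epoch <= $LATEST_EPOCH
    ORDER BY epoch DESC LIMIT 1;")"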

As before, we use these outputs to set the replication source again:

CHANGE REPLICATION SOURCE TO
    SOURCE_LOG_FILE='$BINLOG_FILE',
    SOURCE_LOG_POS=$BINLOG_POSITION;

and finally, start the replica applier again:

START REPLICA;

All of these commands can also be used when spawning new replica appliers in the backup cluster. This is covered in the next tutorial.

Purging MySQL Replication files#

In the previous tutorial, binlog files were always purged after a backup was taken. Here, however, we must first wait for all backup clusters to have applied the binlog files before purging them. The advantage of this is that one does not need to create backups to purge binlogs anymore.

Identify purgeable binlog files#

To calculate which files can be deleted at any point in time, one can issue the following query against the ndb_apply_status table in each backup cluster. This determines the latest epoch that has been applied there:

SELECT @latest:= MAX(epoch) FROM mysql.ndb_apply_status WHERE server_id != 0;

Note that this epoch number will be an epoch number from the primary cluster. In case we have multiple backup clusters running in the future, we take the minimum of the @latest values across all backup clusters. This minimum takes over the role of $BACKUP_EPOCH from the previous tutorial and, as explained there, can be used to identify which binlog files can be deleted:

SELECT SUBSTRING_INDEX(next_file, '/', -1)
    FROM mysql.ndb_binlog_index
    WHERE epoch <= $BACKUP_EPOCH
    ORDER BY epoch DESC LIMIT 1;

Again, we save this to $BINLOG_FILE and run:

PURGE BINARY LOGS TO '$BINLOG_FILE';
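
Putting this together, a hedged shell sketch of the purge step (with hypothetical hostnames) might look like this:

# Latest epoch applied in (each) backup cluster; with several backup clusters,
# take the minimum of these values as $BACKUP_EPOCH:
BACKUP_EPOCH=$(mysql -h backup-mysqld -N -B -e \
    "SELECT MAX(epoch) FROM mysql.ndb_apply_status WHERE server_id != 0;")

# Find the binlog file up to which all changes have been applied
# (files listed before it can be deleted) ...
BINLOG_FILE=$(mysql -h primary-binlog-server -N -B -e "
    SELECT SUBSTRING_INDEX(next_file, '/', -1)
    FROM mysql.ndb_binlog_index
    WHERE epoch <= $BACKUP_EPOCH
    ORDER BY epoch DESC LIMIT 1;")

# ... and purge every binlog file listed before it:
mysql -h primary-binlog-server -e "PURGE BINARY LOGS TO '$BINLOG_FILE';"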

Handling data gaps#

If the backup cluster is used for reads or fail-over scenarios, the availability demands on it will be high. Having to restart it and restore it from a backup should then be avoided if possible. As discussed in the previous tutorial, this will happen if the primary cluster generates data gaps.

One solution, which was not feasible in the previous tutorial but is now, is to scale the binlog servers horizontally or vertically during periods of very high write rates. This reduces the likelihood of having to create a backup and restart the backup cluster with it.