Export data#
The main method for exporting data out of RonDB is the backup feature. Backup is the only method to export data from RonDB in a consistent manner. A backup consists of two sets of files. The first set of files is containing the rows scanned from RonDB. The second set of files contains REDO or UNDO logs that makes the backup consistent to the starting point of the backup (UNDO logs used) or the point in time when the backup is completed (REDO logs used).
Backup was mainly designed to provide a method for restoring the data in RonDB, in either the same cluster or in another cluster. It is an important tool in setting up replication between clusters, it is a very important tool to restore a cluster after failures due to corrupted data.
The data in the backup files can be used to restore a database exactly as it looked at the time of the completion or start of the backup.
It is straightforward to restore it to another RonDB instance or the same RonDB instance. To restore to another database such as InnoDB, Oracle and so forth is a bit less straightforward.
ndb_restore can be used to output the data in CSV format, it can also be used to output the row changes as SQL statements. There are a few obstacles. The backup takes a backup with data in the same representation as in RonDB. This is not a problem in most cases, but it does represent an issue when dealing with BLOBs. BLOBs store the first 256 bytes of the data in the main table. This data is represented correctly. The remainder of the BLOB data is stored in a separate BLOB table with up to 2048 bytes per row in this table. There is one BLOB table per BLOB column.
It is a challenge when exporting data from RonDB to another DBMS to piece together the BLOBs again. In addition the change logs cannot be represented in CSV format since they represent inserts, deletes and updates. They can be represented in the SQL format that ndb_restore can also produce.
The conclusion is that transporting data consistently to another DBMS requires that the data isn’t updated during the export process. In this case we can use the mysqldump and we can use SELECT ... INTO OUTFILE.
When transporting data to another RonDB cluster, backup is the preferred method.
mysqldump#
One method of exporting data from RonDB (and from any other MySQL storage engine) is to use the mysqldump tool. The problem in using this tool for RonDB is that it doesn’t generate a consistent data point. To be able to export data properly using mysqldump it is important that RonDB isn’t allowed to be updated while performing the mysqldump if a consistent data point is required in the export.
mysqldump can generate its export format as SQL statements, text files (CSV files) or as XML files. mysqldump can be used to transport data from any MySQL installation into any other DBMS including other MySQL instances.
mysqldump can be used to export data from MySQL/InnoDB and import it into RonDB. In this case it is important to NOT use the option --single-transaction. This will create very large transactions which can cause the import to fail due to lack of resources for extremely large transactions.
To dump data into a dump file is straightforward, to dump one database one uses the following commands. The first command generates an SQL file with the CREATE TABLE statements required to create the tables in the database, it also generates the CREATE DATABASE command for the database exported.
mysqldump --no-data --databases \
database_name > database_name_tables.sql
mysqldump --no-create-info --no-create-database \
--databases database_name > database_name_data.sql
The default output from mysqldump is SQL INSERT statements. CSV files can be produced using the --tab=DIRNAME parameter. It is also possible to provide the output in XML format using the --xml parameter.
SELECT INTO OUTFILE#
Another method of exporting data from RonDB and from any other MySQL storage engine is to use the SQL command SELECT ... INTO OUTFILE. This command can select the full or parts of a table and export it into a CSV file. The command can be used with any SELECT query. For exporting data the normal use case would be the command.
This command will not produce a consistent data point for RonDB, for InnoDB it will, also for InnoDB one have to ensure that no updates are performed when using this method to export data since there will be no consistency between different tables. Each table will require its own file.
Using this method one file per table will be produced. It is fairly straightforward to parallelise the production of these files by executing several such commands for different tables in parallel.
The command SHOW DATABASES can be used to discover all databases in MySQL. For each database one can then use SHOW TABLES to get all the tables in MySQL. For RonDB all tables in the cluster will be listed in this manner.