MySQL Cluster backup methods

What makes MySQL Cluster a cluster is the network database (NDB) storage engine. Because NDB is a storage engine, we have the sometimes confusing flexibility to choose which tables are stored locally and which are spanned and/or mirrored across the cluster’s data nodes. This also creates a duality when backing up MySQL Cluster databases: data held in NDB is saved with NDB’s native backup method, while everything else is saved as a logical dump of database structure and records.

NDB backup method

Data and metadata held by the data nodes of a live NDB cluster can be saved using the native backup functionality of NDB. This is a straightforward method that can be initiated from any member of the cluster via the management client.

/usr/bin/ndb_mgm -e "START BACKUP <id>"

where ‘id’ is an integer that identifies the snapshot to be created – it has to be specified on non-interactive runs with the -e parameter.

When this command is issued, every ndbd daemon on the data nodes creates a snapshot of its locally stored database segment. The snapshot is stored in the local file system, under the backup directory path configured on the management node (the default path is /usr/local/mysql/data/BACKUP). For every snapshot a directory named ‘BACKUP-<id>’ is created to contain its binary dump files.

A cron job based on the above command can be defined, e.g. on the management node, to generate regular snapshots.

01 * * * * root /bin/bash -c "/usr/bin/ndb_mgm -e \"START BACKUP `date +\%s`\""

Note that the current Unix timestamp is used as the backup id. Taking snapshots too frequently can consume a large amount of disk space, as these are full dumps.
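Because of that, old snapshots should be pruned regularly. A minimal sketch of such a cleanup, as a shell function – the backup directory path and the seven-day retention in the example call are assumptions to be matched to your configuration:

```shell
#!/bin/bash
# prune_ndb_backups DIR DAYS
# Delete BACKUP-<id> snapshot directories under DIR that are older
# than DAYS days. Each snapshot lives in its own directory, so whole
# directories are removed at once.
prune_ndb_backups() {
  local dir=$1 days=$2
  find "$dir" -mindepth 1 -maxdepth 1 -type d -name 'BACKUP-*' \
    -mtime +"$days" -exec rm -rf {} +
}

# e.g. keep one week of snapshots (path taken from the default config):
# prune_ndb_backups /usr/local/mysql/data/BACKUP 7
```

This can be scheduled from cron alongside the backup job itself.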

Logical dumps

On a live MySQL Community Server there’s basically one way to do this – via the good old mysqldump binary. In our example environment a shell script invokes it for every database that contains at least one table whose engine is neither ndbcluster nor memory, excluding information_schema, ndbinfo and performance_schema.

#!/bin/bash
# Dump every database returned by the selection query in backup.sql.
DUMPDIR=/var/lib/mysql/DUMP
[ -d "$DUMPDIR" ] || mkdir -p "$DUMPDIR" || exit 3
TIMESTAMP=$(/bin/date +%s)
/usr/bin/mysql --defaults-extra-file=/etc/mysql/backup.cnf < /etc/mysql/backup.sql |
while read -r db; do
  /usr/bin/mysqldump --defaults-extra-file=/etc/mysql/backup.cnf -r "$DUMPDIR/${TIMESTAMP}_$db.sql" "$db"
done
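The script can be scheduled from cron the same way as the NDB snapshot; the installation path used here is an assumption:

```
30 1 * * * root /usr/local/sbin/mysql_dump.sh
```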

The databases are selected from information_schema with the following query (stored in /etc/mysql/backup.sql). If you also keep the mysql database on NDB, then you can add it to the list of exclusions.

USE information_schema;
SELECT
  TABLE_SCHEMA
FROM
  TABLES
WHERE
  ENGINE NOT IN ('ndbcluster','memory') AND
  TABLE_SCHEMA NOT IN ('information_schema','ndbinfo','performance_schema')
GROUP BY
  TABLE_SCHEMA;

backup.cnf is a custom MySQL configuration file that holds all options for the mysql client binary and mysqldump specific to this backup procedure.

[client]
user = backup
password = 

[mysql]
batch
skip-column-names

[mysqldump]
single-transaction
quote-names
routines
triggers
events
force

A user needs to be created in the MySQL server which has enough privileges to perform the dump.

CREATE USER backup@localhost IDENTIFIED BY '<password>';
GRANT SELECT, SHOW VIEW, LOCK TABLES, EVENT, TRIGGER ON *.* TO backup@localhost;

Restoration of MySQL data

Obviously, different methods must be used when restoring NDB and non-NDB data. Data stored in an NDB cluster is recovered via the ndb_restore utility, while for everything else the mysql client program is used. Because there are plenty of options that can be combined in several ways, the steps are described in rather general terms.

Restoring NDB tables and databases

Though an NDB backup is initiated with a single command, snapshots are created separately on every node. When recovering data from those backups, ndb_restore must be executed once for each backup file – in other words, once for every data node that was in the cluster at the time the backup was taken. To avoid inconsistencies, the recommendation is to enable single user mode on the cluster before commencing, which guarantees that for the duration of the recovery only the ndb_restore process has access to the data nodes. Single user mode can be avoided, though, if you can limit access to the database by other means.

/usr/bin/ndb_mgm -e "ENTER SINGLE USER MODE <ndb_node_id>"

ndb_restore runs as an API node with its own node_id, which can be defined upon command execution. This also means that a vacant API/mysqld slot is needed in the NDB configuration (an empty [api] section in config.ini should do).
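An empty [api] section is enough to reserve such a slot; if you want a predictable id to pass to --ndb-nodeid, the slot can also be pinned. The NodeId value below is an example:

```ini
# Spare API slot in the cluster's config.ini for ndb_restore.
# NodeId pins the slot to a fixed id (the value 10 is an example).
[api]
NodeId=10
```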

The official documentation states that running ndb_restore requires an empty target table or database. When the intention is to restore all databases from the dump, the simplest way to achieve this is to start ndbd with the --initial parameter, which destroys all data on the node. Otherwise we only need to truncate the tables or drop the databases involved.

When whole tables or databases need to be recovered along with their metadata, the command below has to be executed once, against a single data node. The structures will then be synchronized to the others.

/usr/bin/ndb_restore \
[-c <connection_string>] \
-n <data_node_id> \
-b <backup_id> \
--backup_path=<backup_path> \
--ndb-nodeid=<node_id> \
[--include-tables=<table_list>] \
[--include-databases=<db_list>] \
[--exclude-tables=<table_list>] \
[--exclude-databases=<db_list>] \
-m

Where

  • connection_string is the IP address or hostname of the ndb_mgmd to connect to. This option can be omitted when running the command on the management node.
  • data_node_id is the ID of the data node on which restoration should be performed
  • backup_id is the numeric identifier of the snapshot from which files should be recovered.
  • backup_path is the full path of the directory that contains the backup files of the given snapshot.
  • ndb-nodeid is the same node_id that was mentioned above when discussing single user mode.
  • table_list, db_list contain the name of tables and/or databases (delimited by commas), which should or should not be restored from dump:
    when none is given, everything is restored;
    when only exclude is given, all but the named objects are restored;
    when only include is given, only the named objects are restored
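Putting it together, a metadata restore might look like the sketch below. All values – node ids, backup id, hostname and path – are hypothetical examples, and the echo makes this a dry run; remove it to actually execute:

```shell
#!/bin/bash
# Dry-run sketch of a metadata restore (-m) against one data node.
# All values below are hypothetical examples.
BACKUP_ID=1397000000          # the Unix-timestamp id used at backup time
DATA_NODE=2                   # id of the data node being restored
API_NODE=10                   # spare API slot used by ndb_restore
BACKUP_PATH=/usr/local/mysql/data/BACKUP/BACKUP-$BACKUP_ID

# 'echo' keeps this a dry run; remove it to perform the restore.
echo /usr/bin/ndb_restore -c mgm_host -n "$DATA_NODE" -b "$BACKUP_ID" \
  --backup_path="$BACKUP_PATH" --ndb-nodeid="$API_NODE" -m
```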

The command below recovers only the data records from the dump. Execute it once for every data node (data_node_id) in the cluster.

/usr/bin/ndb_restore \
[-c <connection_string>] \
-n <data_node_id> \
-b <backup_id> \
--backup_path=<backup_path> \
--ndb-nodeid=<node_id> \
[--include-tables=<table_list>] \
[--include-databases=<db_list>] \
[--exclude-tables=<table_list>] \
[--exclude-databases=<db_list>] \
-r

See description of arguments above.
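The per-node data restore lends itself to a loop. Again a dry-run sketch with hypothetical ids; remove the echo to execute:

```shell
#!/bin/bash
# Restore data records (-r) once per data node that existed at backup
# time. Node ids, backup id and path are hypothetical examples.
BACKUP_ID=1397000000
API_NODE=10
BACKUP_PATH=/usr/local/mysql/data/BACKUP/BACKUP-$BACKUP_ID
DATA_NODES="2 3"              # ids of all data nodes at backup time

for n in $DATA_NODES; do
  # 'echo' keeps this a dry run; remove it to perform the restore.
  echo /usr/bin/ndb_restore -n "$n" -b "$BACKUP_ID" \
    --backup_path="$BACKUP_PATH" --ndb-nodeid="$API_NODE" -r
done
```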

To leave single user mode after recovery, execute the following command. If something goes wrong, restarting the data nodes will also bring them back to normal mode.

/usr/bin/ndb_mgm -e "EXIT SINGLE USER MODE"

When some of the above requirements (e.g. limiting access, destruction of existing data) are unacceptable, ndb_restore can be used with the --print* or --tab parameters to extract the content of the dumps, so that data can be manually processed and inserted into a live database.

Restoring from logical dumps

Logical dump files are SQL scripts which recreate all tables of a single database, including data records. They should be fed to the MySQL command line client through its standard input. It is advisable to run this command as root, because an extensive range of privileges is required. Existing data is destroyed, as the tables are dropped and recreated by the scripts. When you need a non-destructive way of recovery, restore into a temporary database and insert selected records manually.

/usr/bin/mysql -u <user_name> -p <db_name> < <dump_file>

Where

  • user_name is the MySQL user to connect with
  • db_name is the name of the database to use
  • dump_file is the name and path of a database dump
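The non-destructive variant might look like the sketch below: the dump is loaded into a scratch database and rows are copied back selectively. All names are hypothetical, and the echoes make this a dry run – remove them to execute against a live server:

```shell
#!/bin/bash
# Non-destructive recovery sketch. Database and file names below are
# hypothetical examples; echoes keep this a dry run.
DUMP=/var/lib/mysql/DUMP/1397000000_mydb.sql
TMPDB=restore_tmp

echo "/usr/bin/mysql -u root -p -e 'CREATE DATABASE $TMPDB'"
echo "/usr/bin/mysql -u root -p $TMPDB < $DUMP"
# afterwards, copy back only the rows you need, e.g.:
#   INSERT INTO mydb.t SELECT * FROM restore_tmp.t WHERE id = 42;
```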
