Loading...
 

MySQLReplication

Setting up replication on MySQL

One of the difficulties with a large and active MySQL database is making clean backups without having to bring the server down. Otherwise, a backup may slow down the system and there may be inconsistency with data, since related tables may be changed while another is being backed up. Taking the server down will ensure consistency of data, but it means interruption of service to users. Sometimes this is necessary and unavoidable, but daily server outages for backing up data may be unacceptable. A simple alternative method to ensure reliable backups without having to shut down the server daily is to set up replication for MySQL.

Typically, replication is a system configuration whereby the MySQL server, known in this context as a master server, houses the data and handles client requests, while another MySQL server (a slave server) contains a complete copy of the data and duplicates all SQL statements in which data is changed on the master server right after it happens. There are several uses for replication (e.g., load balancing), but the concern of this article has to do with using replication for data backups. You can set up a separate server to be a slave and then once a day turn replication off to make a clean backup. When you’re done, replication can be restarted and the slave will automatically query the master for the changes to the data that it missed while it was offline. Replication is an excellent feature and it’s part of MySQL. You just need to set it up.

The Replication Process


Before explaining how to set up replication, let me quickly explain the steps that MySQL goes through to maintain a replicated server. The process is different depending on the version of MySQL. For purposes of this article, my comments will be for version 4.0 or higher, since most systems now are using the later versions.

When replication is running, basically, as SQL statements are executed on the master server, MySQL records them in a binary log (bin.log) along with a log position identification number. The slave server in turn, through an IO thread, regularly and very often reads the master’s binary log for any changes. If it finds a change, it copies the new statements to its relay log (relay.log). It then records the new position identification number in a file (master.info) on the slave server. The slave then goes back to checking the master binary log, using the same IO thread. When the slave server detects a change to its relay log, through an SQL thread the slave executes the new SQL statement recorded in the relay log. As a safeguard, the slave also queries the master server through the SQL thread to compare its data with the master’s data. If the comparison shows inconsistency, the replication process is stopped and an error message is recorded in the slave’s error log (error.log). If the results of the query match, the new log position identification number is recorded in a file on the slave (relay-log.info) and the slave waits for another change to the relay log file.

This process may seem involved and complicated at first glance, but it all occurs quickly, it isn’t a significant drain on the master server, and it ensures reliable replication. Also, it’s surprisingly easy to set up. It only requires a few lines of options to be added to the configuration file (i.e., my.cnf) on the master and slave servers. If you’re dealing with a new server, you’ll need to copy the databases on the master server to the slave to get it caught up. Then it’s merely a matter of starting the slave for it to begin replicating.

http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-replication.html

For replication, the binary log is used on master replication servers as a record of the statements to be sent to slave servers. Many details of binary log format and handling are specific to this purpose. The master server sends the events contained in its binary log to its slaves, which execute those events to make the same data changes that were made on the master. A slave stores events received from the master in its relay log until they can be executed. The relay log has the same format as the binary log.

Certain data recovery operations require use of the binary log. After a backup file has been restored, the events in the binary log that were recorded after the backup was made are re-executed. These events bring databases up to date from the point of the backup.

Types of binary logging

Statement-based logging: Events contain SQL statements that produce data changes (inserts, updates, deletes)
Row-based logging: Events describe changes to individual rows. Mixed logging uses statement-based logging by default but switches to row-based logging automatically as necessary. Row-based (and mixed) logging is available beginning with MySQL 5.1.

The mysqlbinlog utility can be used to print binary or relay log contents in readable form.

Master Configuration
Edit my.cnf:

#skip-networking
#bind-address		=	127.0.0.1
log-bin  =  /var/log/mysql/mysql-master-bin.log
binlog-do-db	= database1
binlog-do-db	= database2
server-id  =  1  # must be unique on both servers; and each server can have only one id


Restart MySQL
Log into mysql as root and create user privileges for replication:

GRANT REPLICATION SLAVE ON *.* TO 'username'@'%' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;


For pre mysqld v4

GRANT FILE ON *.* TO 'username'@'*' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;


Record the current position of the database so that replication on the slave knows where to begin (write the values down):

FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;


Quit out of the database. Perform a mysqldump on the database and move the dump to the slave:

$ mysqldump -u root -p --extended-insert --add-drop-database --master-data --databases  database1 database2 > backup.sql
--add-drop-database = This will add a drop database line before the create<
--master-data = adds a change master line – only needed when copying from a master to a slave
--databases can also be substituted with --all-databases


Slave Configuration
Edit my.cnf

server-id = 2
master-host = master_address
master-port = 3306
master-user =  user
master-password =  pass
replicate-do-db	= database1
replicate-do-db	= database2
log-bin = /var/log/mysql/bin.log
log-bin-index = /var/log/mysql/log-bin.index
log-error = /var/log/mysql/error.log
relay-log = /var/log/mysql/relay.log
relay-log-info-file = /var/log/mysql/relay-log.info
relay-log-index = /var/log/mysql/relay-log.index


Restart mysql and load the dump file into mysql:

$ mysql -u root -p  <  backup.sql


Run the change master command; (Don’t think this is necessary – at least on mysql5)

CHANGE MASTER TO MASTER_HOST='10.44.20.61', MASTER_USER='replicate', MASTER_PASSWORD='replicate', MASTER_LOG_FILE='<file>', MASTER_LOG_POS=<pos>;

  1. Log into mysql and start the slave:
    START SLAVE;
  2. Check the slave is backing up and connected:
    SHOW SLAVE STATUS;
  3. On the master:
    FLUSH TABLES WITH READ LOCK;
    UNLOCK TABLES;
    SHOW PROCESSLIST;

PROBLEMS

setting “log = /var/log/mysql/debug.log” in my.cnf will give more debugging info
If the replication username or password in my.cnf have changed on the slave, then:

090827  4:25:07   20 Connect    Access denied for user 'replicate'@'dbslave.tmc.uk.intranet' (using password: YES)

Could be seen in the debug log.
This is caused by the original settings being stored in /var/db/mysql/master.info on the slave. Removing this file and restarting the slave will accept the new settings from my.cnf




The Replication User


There are only a few steps to setting up replication. The first step is to set up a user account to use only for replication. It’s best not to use an existing account for security reasons. To do this, enter an SQL statement like the following on the master server, logged in as root or a user that has GRANT OPTIONprivileges:

GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'replicant'@'slave_host IDENTIFIED BY 'my_pwd';


In this SQL statement, the user account replicantis granted only what’s needed for replication. The user name can be almost anything. The host name (or IP address) is given in quotes. You should enter this same statement on the slave server with the same user name and password, but with the master’s host name or IP address. This way, if the master fails and will be down for a while, you could redirect users to the slave with DNS or by some other method. When the master is back up, you can then use replication to get it up to date by temporarily making it a slave to the former slave server. Incidentally, if you upgraded MySQL to version 4.0 recently, but didn’t upgrade your mysqldatabase, the GRANTstatement above won’t work because these privileges didn’t exist in the earlier versions. For information on fixing this problem, see MySQL’s documentation on Upgrading the Grants Tables.

Configuring the Servers


Once the replication user is set up on both servers, we will need to add some lines to the MySQL configuration file on the master and on the slave server. Depending on the type of operating system, the file will probably be called my.cnfor my.ini. On Unix-type systems, the configuration file is usually located in the /etcdirectory. On Windows systems, it’s usually located in c:\or in c:\Windows. Using a text editor, add the following lines to the configuration file, under the mysqldgroup heading:

server-id = 1
log-bin = /var/log/mysql/bin.log


The server identification number is an arbitrary number to identify the master server. Almost any whole number is fine. A different one should be assigned to the slave server to keep them straight. The second line above instructs MySQL to perform binary logging to the path and file given. The actual path and file name is mostly up to you. Just be sure that the directory exists and the user mysqlis the owner, or at least has permission to write to the directory. Also, for the file name use the suffix of “.log” as shown here. It will be replaced automatically with an index number (e.g., “.000001”) as new log files are created when the server is restarted or the logs are flushed.

For the slave server, we will need to add a few more lines to the configuration file. We’ll have to provide information on connecting to the master server, as well as more log file options. We would add lines similar to the following to the slave’s configuration file:

server-id = 2
master-host = mastersite.com
master-port = 3306
master-user = replicant
master-password = my_pwd
log-bin = /var/log/mysql/bin.log
log-bin-index = /var/log/mysql/log-bin.index
log-error = /var/log/mysql/error.log
relay-log = /var/log/mysql/relay.log
relay-log-info-file = /var/log/mysql/relay-log.info
relay-log-index = /var/log/mysql/relay-log.index



This may seem like a lot, but it’s pretty straightforward once you pick it apart. The first line is the identification number for the slave server. If you set up more than one slave server, give them each a different number. If you’re only using replication for backing up your data, though, you probably won’t need more than one slave server. The next set of lines provides information on the master server: the host name as shown here, or the IP address of the master may be given. Next, the port to use is given. Port 3306 is the default port for MySQL, but another could be used for performance or security considerations. The next two lines provide the user name and password for logging into the master server.

The last two stanzas above set up logging. The second to last stanza starts binary logging as we did on the master server, but this time on the slave. This is the log that can be used to allow the master and the slave to reverse roles, as mentioned earlier. The binary log index file (log-bin.index) is for recording the name of the current binary log file to use. As the server is restarted or the logs are flushed, the current log file changes and its name is recorded here. The log-erroroption establishes an error log. If you don’t already have this set up, you should, since it’s where any problems with replication will be recorded. The last stanza establishes the relay log and related files mentioned earlier. The relay log makes a copy of each entry in the master server’s binary log for performance’s sake, the relay-log-info-fileoption names the file where the slave’s position in the master’s binary log will be noted, and the relay log index file is for keeping track of the name of the current relay log file to use for replicating.

Copying Databases and Starting Replication


If you’re setting up a new master server that doesn’t contain data, then there’s nothing left to do but restart the slave server. However, if you’re setting up replication with an existing server that already has data on it, you will need to make an initial backup of the databases and copy it to the slave server. There are many methods to do this; for our examples, we’ll use the utility mysqldumpto make a backup while the server is running. However, there’s still the problem with attaining consistency of data on an active server. Considering the fact that once you set up replication you may never have to shut down your server for backups again, it might be worth while at least to lock the users out this one last time to get a clean, consistent backup. To run the master server so that only roothas access, we can reset the variable max_connectionslike so:

SHOW VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 100   |
+-----------------+-------+
SET GLOBAL max_connections = 0;



The first SQL statement isn’t necessary, but we may want to know the initial value of the max_connectionsvariable so that we can change it back when the backup is finished. Although setting the variable to a value of 0suggests that no connections are allowed, one connection is actually reserved for the rootuser. Of course, this will only prevent any new connections. To see if there are any connections still running, enter SHOW PROCESSLIST;. To terminate any active processes, you can use the KILLstatement.

With exclusive access to the server, using mysqldumpis usually very quick. We would enter the following from the command line on the master server:

$ mysqldump --user=root --password=my_pwd --extended-insert --all-databases --master-data  > /tmp/backup.sql



This will create a text file containing SQL statements to create all of the databases and tables with data. The --extended-insert+- option will create multiple-row -+INSERTstatements and thereby allow the backup to run faster, for the least amount of down time or drain on services. The --master-dataoption above locks all of the tables during the dump to prevent data from being changed, but allows users to continue reading the tables. With exclusive access, this feature isn’t necessary. However, this option also adds a few lines like the following to the end of the dump file:

--
-- Position to start replication from
--
CHANGE MASTER TO MASTER_LOG_FILE='bin.000846';
CHANGE MASTER TO MASTER_LOG_POS=427;


When the dump file is executed on the slave server, these last lines will record the name of the master’s binary log file and the position in the log at the time of the backup, while the tables were locked. When replication is started, it will go to this log file and execute any SQL statements recorded starting from the position given. This is meant to ensure that any data changed while setting up the slave server isn’t missed. To execute the dump file to set up the databases and data on the slave server, copy the dump file to the slave server, make sure MySQL is running, then enter something like the following on the slave:

$ mysql --user=root --password=my_pwd < /tmp/backup.sql


This will execute all of the SQL statements in the dump file, which will include the CREATEand INSERTstatements. Once the backed-up databases are loaded onto the slave server, execute the following SQL statement while logged in as rooton the slave:

START SLAVE;


After this statement is run, the slave will connect to the master and get the changes it missed since the backup. From there, it will stay current by continuously checking the binary log as outlined before.

Backups with Replication


With replication running, it’s an easy task to make a backup of the data. You just need to temporarily stop the slave server from replicating by entering the following SQL statement while logging onto the slave server as rootor a user with SUPERprivileges:

STOP SLAVE;


The slave server knows the position where it left off in the binary log of the master server. So we can take our time making a backup of the replicated databases on the slave server. We can use any backup utility or method we prefer. When the backup is finished, we would enter the following SQL statement from the slave server as rootto restart replication:

START SLAVE;



After entering this statement, there should be a flurry of activity on the slave as it executes the SQL statements that occurred while it was down. In a very short period of time it should be current.

Automating Backups


If replication and the backup process are working properly, we can write a simple shell script to stop replication, back up the data on the slave server, and start the slave again. Such a shell script would look something like this:

#!/bin/sh
date = $(date +%Y%m%d)
mysqladmin --user=root --password=my_pwd stop-slave
mysqldump --user=root --password=my_pwd --lock-all-tables --all-databases > /backups/mysql/backup-${date}.sql
mysqladmin --user=root --password=my_pwd start-slave


In this example, we’re using the mysqladminutility to stop and start replication on the slave. On the first line, you may have noticed that we’re capturing the date using the system function dateand putting it into a good format (e.g., 20050615). This variable is used with mysqldumpin the script for altering the name of the dump file each day. Of course, you can set the file path and the name of the dump file to your preferences.

This is a simple script. You may want to write something more elaborate and allow for error checking. You probably would also want to compress the dump files to save space and write them to a removable media like a tape or CD. Once you have your script set up, test it. If it works, you can add it to your crontabor whatever scheduling utility you use on your server.

Conclusion


Replication is a useful administrative feature of MySQL. It’s an excellent way to be assured of good regular backups of databases. There are more options and SQL statements available for replication than I was able to present here. I cover them individually in my book MySQL in a Nutshell. For active and large systems, you may want to set up more than one slave server for added protection of data. The configuration and concepts are the same for multiple slaves as it is for one slave. For extremely active and large databases, you might want to consider purchasing software like that offered by Emic. Their software costs a bit, but it does an excellent job of handing slave serves for backups and load balancing, especially.