Schema Migrations in Large-Scale MySQL Databases
On small MySQL databases, making schema changes is as simple as running an ALTER query. On large databases, however, it's not so simple: a single ALTER on a big, busy table can run for hours and disrupt production. In this article you will learn how to tackle this challenge.
Problems of schema migrations
When dealing with large-scale databases, we often encounter the following challenges during schema migrations:
Downtime and Service Disruption - Schema changes often require taking the database offline, resulting in downtime for applications that depend on it.
Data Integrity Issues - Data consistency can be problematic during a migration, especially when altering large tables.
Performance Degradation - Adding indexes, altering large tables, or adding columns can cause performance bottlenecks, slow queries, and increased load on the database during the migration itself.
Locking Issues - Many schema changes take table-level locks, blocking writes to the table for the duration of the change.
Renaming approach
A common approach to these kinds of problems is to take advantage of MySQL's online operations and build a solution around them. One such instantaneous operation is RENAME TABLE, and here is how we can use it to perform a schema migration with zero downtime:
Step 1: Copy the schema of Table1 and create an empty Table1_tmp
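A minimal sketch of this step (table names are illustrative):

```sql
-- Print the full definition of Table1 (columns, indexes, foreign keys),
-- then run the printed statement with the name changed to Table1_tmp.
SHOW CREATE TABLE Table1;

-- (CREATE TABLE Table1_tmp LIKE Table1 also works, but note that it
-- does not copy foreign key definitions.)
```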
Step 2: Write down and drop foreign keys from Table1_tmp
The reason we do this is that foreign key checks add a lot of extra processing while the database is inserting data. Since we are about to sync all the data into this new table, we want to keep that overhead to a minimum so the sync finishes as fast as possible.
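A sketch of this step; the constraint name fk_table2_id is hypothetical, so list your actual foreign keys first:

```sql
-- Write down the current foreign key definitions so they can be
-- re-added after the migration finishes
SHOW CREATE TABLE Table1_tmp;

-- Drop each foreign key by its constraint name
ALTER TABLE Table1_tmp DROP FOREIGN KEY fk_table2_id;
```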
Step 3: Apply schema migration on Table1_tmp
Make sure that if you are adding a column, you specify the default value that will be used for existing rows while inserting data from Table1 into Table1_tmp.
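As a hypothetical example of such a migration, adding a column with an explicit default:

```sql
-- Hypothetical new column; the DEFAULT guarantees that rows copied
-- over from Table1 (which lack this column) get a well-defined value
ALTER TABLE Table1_tmp
  ADD COLUMN status VARCHAR(16) NOT NULL DEFAULT 'active';
```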
Step 4: Start syncing data from Table1 to Table1_tmp
During the sync, Table1 can remain in use. This means we are free to keep using the database, but we need to make sure the binlog is enabled, so that write queries can be replayed on the new Table1_tmp and it does not end up out of sync once the initial copy from Table1 finishes.
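Before starting the sync, it is worth confirming the binlog settings this approach relies on:

```sql
-- The binary log must be enabled so that writes to Table1 can be
-- replayed on Table1_tmp during the sync
SHOW VARIABLES LIKE 'log_bin';        -- expect ON
SHOW VARIABLES LIKE 'binlog_format';  -- tools such as gh-ost expect ROW
```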
Step 5: Once we are in sync, just rename tables
Renaming is an instantaneous operation, so all we need to do is rename Table1 to Table1_old, for example, and right after that rename Table1_tmp to Table1.
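The swap itself can be done in a single statement, which MySQL performs atomically:

```sql
-- Both renames happen in one atomic operation, so no client ever
-- observes a moment where Table1 is missing
RENAME TABLE Table1     TO Table1_old,
             Table1_tmp TO Table1;
```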
Step 6: Add the dropped foreign keys
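A sketch of restoring the keys recorded in step 2; constraint, column, and table names here are hypothetical:

```sql
-- Re-create the foreign keys that were dropped in step 2
ALTER TABLE Table1
  ADD CONSTRAINT fk_table2_id
  FOREIGN KEY (table2_id) REFERENCES Table2 (id);
```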
Available tooling
To help with the process, a few open source tools are already available today. One of them is gh-ost (https://github.com/github/gh-ost); another is pt-online-schema-change (https://docs.percona.com/percona-toolkit/pt-online-schema-change.html).
Gh-ost snippets
To save you some time, here are a few snippets we usually run when doing migrations with gh-ost, so you can adapt them to your liking and test things yourself.
Running gh-ost
To run gh-ost you will need to install it on a server that has access to your database. Replace all ${VARS} with the appropriate details for your database, and replace the "ADD INDEX …" clause with the schema change statement you wish to perform on the table.
gh-ost \
--assume-rbr \
--user="${DB_USER}" \
--password="${DB_PASS}" \
--host="${DB_HOST}" \
--allow-on-master \
--database="${DB_NAME}" \
--table="${DB_TABLE}" \
--verbose \
--alter="ADD INDEX idx_example (column_one,column_two,datetime)" \
--initially-drop-ghost-table \
--skip-foreign-key-checks \
--max-load=Threads_running=30 \
--chunk-size=50 \
--cut-over=default \
--exact-rowcount \
--concurrent-rowcount \
--serve-socket-file=/tmp/ghost.sock \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute
This command will create a copy of the schema of your ${DB_TABLE} table, run the --alter statement on the newly created table, and start syncing data from the main table to the new one.
Needless to say, you should run this command in screen or something similar that you can detach from, since it will probably take a long time.
For further information on flags that you can/should choose, have a look at the cheatsheet.
Checking status
To check the status of the sync, gh-ost conveniently serves it over a Unix socket.
echo status | nc -U /tmp/ghost.sock
# Migrating `mydb`.`mytable`; Ghost table is `mydb`.`_mytable_gho`
# Migrating ip-172-27-0-49:3306; inspecting ip-172-27-0-72:3306; executing on ghost
# Migration started at Wed Jan 09 16:25:10 +0000 2024
# chunk-size: 500; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=30; critical-load: ; nice-ratio: 1.500000
# throttle-additional-flag-file: /tmp/ghost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.postpone.flag [set]
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/ghost.sock
Copy: 22404000/81210224 27.6%; Applied: 4052; Backlog: 0/1000; Time: 2h21m25s(total), 2h21m24s(copy); streamer: mysql-bin-changelog.003876:31236708; State: migrating; ETA: 6h11m11s
Throttling
If you see impact on production due to this process, you can set niceness and reduce burden on the database:
echo "nice-ratio=1.5" | nc -U /tmp/ghost.sock
Perform rename
Once your tables are in sync, you are free to perform the cut-over. To do that, gh-ost constantly checks for the postpone flag file; once you remove it, the running process will perform the cut-over and exit, finishing the migration.
rm /tmp/ghost.postpone.flag
More commands…
If you are interested in more interactive commands, you can check here:
https://github.com/github/gh-ost/blob/master/doc/interactive-commands.md
Final thoughts
Although pt-online-schema-change is a powerful tool, we prefer using gh-ost for the job. Its additional features, which are essential for managing large databases, make it invaluable. Of course, you are free to test both and choose the one you prefer.