DDL Failures in MariaDB Galera Cluster


A MariaDB support customer recently asked me what would happen if a Data Definition Language (DDL) statement failed to complete on one or more nodes in MariaDB Galera Cluster. In this blog post, I will demonstrate exactly what happens.

The demonstration below was performed on a 2-node cluster running MariaDB 10.1, but other Galera Cluster distributions should work similarly.

Schema Upgrades in Galera Cluster

Schema upgrades and DDL in Galera Cluster are handled a bit differently than in a standalone MariaDB or MySQL server.

Transactions in Galera Cluster are replicated in a “virtually synchronous” manner. This means that, unless a particular node is desynchronized from the cluster, all replicated tables need to have identical (or at least compatible) definitions on all nodes. If a transaction modifies a particular table and some nodes have incompatible definitions of that table, those nodes will not be able to apply the transaction to their copy of the table. This also means that incompatible schema upgrades need to happen on all nodes at the same time.
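
As a quick aside, you can check whether a node is currently synchronized with the cluster using the standard wsrep status variables. This is just an illustrative check, not part of the demonstration below:

SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'; -- e.g. 'Synced' or 'Donor/Desynced'
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';      -- 'Primary' when the node is part of the primary component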

Galera Cluster provides two methods of applying schema upgrades, and you can switch between them using the wsrep_OSU_method option. One method, Total Order Isolation (TOI), applies incompatible changes in a slow but safe way. The other method, Rolling Schema Upgrade (RSU), applies backward-compatible changes in a faster way. Both are described in more detail in the Galera Cluster documentation page about Schema Upgrades.
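
For reference, here is a minimal sketch of how the method can be switched, either for the current session or as the server-wide default (the session-level form is what the examples below use):

-- Affects only the current connection
SET SESSION wsrep_OSU_method = 'RSU';
-- ... run backward-compatible DDL here ...
SET SESSION wsrep_OSU_method = 'TOI';

-- Default for new connections
SET GLOBAL wsrep_OSU_method = 'TOI';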

But since DDL is treated specially in Galera Cluster, what happens if some DDL fails to complete successfully on one or more nodes?

A DDL Failure in TOI Mode

First, let's look at what happens when DDL fails in TOI mode. Let's make sure that TOI mode is currently set:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'wsrep_osu_method';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| wsrep_osu_method | TOI   |
+------------------+-------+
1 row in set (0.00 sec)

It is, so let’s create a table by executing the following on one node:

MariaDB [db1]> CREATE TABLE tab (
-> id int PRIMARY KEY,
-> str varchar(50)
-> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.03 sec)

Let’s make sure that this table exists on both nodes.

Node 1:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Node 2:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

We want to see DDL fail, so let's set that up by giving one node a slightly different definition of the table. We can do that by running some DDL in RSU mode:

MariaDB [db1]> SET wsrep_osu_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [db1]> ALTER TABLE tab ADD COLUMN num int DEFAULT NULL;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0

Do the two nodes have the same definition of the table now?

Node 1:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
`num` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Node 2:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

They now have different definitions, so let’s execute some DDL on node 1 that will fail on node 2. We also need to set wsrep_OSU_method back to TOI.

MariaDB [db1]> SET wsrep_osu_method='TOI';
Query OK, 0 rows affected (0.00 sec)

MariaDB [db1]> ALTER TABLE tab MODIFY COLUMN num bigint DEFAULT NULL;
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0

Since node 2 does not have the num column, this DDL should fail on that node. Let's look at the definition of the table on both nodes now:

Node 1:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
`num` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Node 2:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

The DDL obviously failed on node 2, but it doesn't look like anything happened. Let's look at node 2's error log:

2016-07-28 14:42:48 140579533392640 [ERROR] Slave SQL: Error 'Unknown column 'num' in 'tab'' on query. Default database: 'db1'. Query: 'ALTER TABLE tab MODIFY COLUMN num bigint DEFAULT NULL', Internal MariaDB error code: 1054
2016-07-28 14:42:48 140579533392640 [Warning] WSREP: RBR event 1 Query apply warning: 1, 3
2016-07-28 14:42:48 140579533392640 [Warning] WSREP: Ignoring error for TO isolated action: source: d25d604b-54f0-11e6-a77e-f681b74c50f4 version: 3 local: 0 state: APPLYING flags: 65 conn_id: 5 trx_id: -1 seqnos (l: 7, g: 3, s: 2, d: 2, ts: 1017404963701)

Node 2 just ignored the error!

Now what actually happens when we try to insert something into the num field?

Node 1:

MariaDB [db1]> INSERT INTO tab (id, str, num) VALUES (1, 'str1', 1);
Query OK, 1 row affected (0.01 sec)

MariaDB [db1]> SELECT * FROM db1.tab;
+----+------+------+
| id | str  | num  |
+----+------+------+
|  1 | str1 |    1 |
+----+------+------+
1 row in set (0.00 sec)

Node 2:

MariaDB [db1]> SELECT * FROM tab;
+----+------+
| id | str  |
+----+------+
|  1 | str1 |
+----+------+
1 row in set (0.00 sec)

The extra column at the end of the list is just ignored! This is because Galera Cluster follows many of the same compatibility rules as standard MySQL replication, where an extra column at the end of the table is considered a valid difference.
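
This works because Galera replicates data changes as row-based events, and the row applier maps columns by position, so a trailing column that exists only on the originating node is simply discarded on nodes whose table lacks it. Row format is a hard requirement on Galera nodes, which you can verify with a quick illustrative check:

SHOW GLOBAL VARIABLES LIKE 'binlog_format'; -- must be 'ROW' on Galera Cluster nodes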

But let's see what happens if the difference is invalid.

Node 1:

MariaDB [db1]> SET wsrep_osu_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [db1]> ALTER TABLE tab DROP COLUMN num;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0

MariaDB [db1]> ALTER TABLE tab ADD COLUMN num int DEFAULT NULL AFTER id;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`num` int(11) DEFAULT NULL,
`str` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Node 2:

MariaDB [db1]> SHOW CREATE TABLE tab;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) NOT NULL,
`str` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

And now let’s try to insert some data:

Node 1:

MariaDB [db1]> INSERT INTO tab (id, num, str) VALUES (2, 1, 'str2');
Query OK, 1 row affected (0.00 sec)

MariaDB [db1]> SELECT * FROM db1.tab;
+----+------+------+
| id | num  | str  |
+----+------+------+
|  1 | NULL | str1 |
|  2 |    1 | str2 |
+----+------+------+
2 rows in set (0.00 sec)

Node 2:

MariaDB [db1]> SELECT * FROM tab;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 5
Current database: db1

+----+------+------+
| id | num  | str  |
+----+------+------+
|  1 | NULL | str1 |
|  2 |    1 | str2 |
+----+------+------+
2 rows in set (0.00 sec)

You might notice two weird things here:

  • Our client was disconnected from node 2.
  • Node 2 has the num column now.

Let's look at node 2's error log to find out what happened.

First, we can see that it tried to apply the transaction 4 times:

2016-07-28 14:55:34 140579533392640 [ERROR] Slave SQL: Column 1 of table 'db1.tab' cannot be converted from type 'int' to type 'varchar(50)', Internal MariaDB error code: 1677
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: RBR event 2 Write_rows_v1 apply warning: 3, 5
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: Failed to apply app buffer: seqno: 5, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time
2016-07-28 14:55:34 140579533392640 [ERROR] Slave SQL: Column 1 of table 'db1.tab' cannot be converted from type 'int' to type 'varchar(50)', Internal MariaDB error code: 1677
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: RBR event 2 Write_rows_v1 apply warning: 3, 5
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: Failed to apply app buffer: seqno: 5, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2016-07-28 14:55:34 140579533392640 [ERROR] Slave SQL: Column 1 of table 'db1.tab' cannot be converted from type 'int' to type 'varchar(50)', Internal MariaDB error code: 1677
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: RBR event 2 Write_rows_v1 apply warning: 3, 5
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: Failed to apply app buffer: seqno: 5, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2016-07-28 14:55:34 140579533392640 [ERROR] Slave SQL: Column 1 of table 'db1.tab' cannot be converted from type 'int' to type 'varchar(50)', Internal MariaDB error code: 1677
2016-07-28 14:55:34 140579533392640 [Warning] WSREP: RBR event 2 Write_rows_v1 apply warning: 3, 5
2016-07-28 14:55:34 140579533392640 [ERROR] WSREP: Failed to apply trx: source: d25d604b-54f0-11e6-a77e-f681b74c50f4 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 5 trx_id: 76646 seqnos (l: 9, g: 5, s: 4, d: 3, ts: 1783539089510)

When all of those attempts failed, the node determined that it was inconsistent with the rest of the cluster, so it shot itself in the head:

2016-07-28 14:55:34 140579533392640 [ERROR] WSREP: Failed to apply trx 5 4 times
2016-07-28 14:55:34 140579533392640 [ERROR] WSREP: Node consistency compromized, aborting...
2016-07-28 14:55:34 140579533392640 [Note] WSREP: Closing send monitor...
2016-07-28 14:55:34 140579533392640 [Note] WSREP: Closed send monitor.
2016-07-28 14:55:34 140579533392640 [Note] WSREP: gcomm: terminating thread
2016-07-28 14:55:34 140579533392640 [Note] WSREP: gcomm: joining thread
2016-07-28 14:55:34 140579533392640 [Note] WSREP: gcomm: closing backend
…snip…
2016-07-28 14:55:35 140579533392640 [Note] WSREP: /usr/sbin/mysqld: Terminated.

The node was then automatically restarted by systemd, at which point it performed a State Snapshot Transfer (SST):

2016-07-28 14:55:44 139821320411264 [Note] WSREP: Read nil XID from storage engines, skipping position init
2016-07-28 14:55:44 139821320411264 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
2016-07-28 14:55:44 139821320411264 [Note] WSREP: wsrep_load(): Galera 25.3.15(r3578) by Codership Oy loaded successfully.
2016-07-28 14:55:44 139821320411264 [Note] WSREP: CRC-32C: using hardware acceleration.
2016-07-28 14:55:44 139821320411264 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
…snip…
2016-07-28 14:55:45 139821320096512 [Note] WSREP: New cluster view: global state: d25dbeb7-54f0-11e6-bac9-c2bc3c331fb6:5, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 3
2016-07-28 14:55:45 139821320096512 [Warning] WSREP: Gap in state sequence. Need state transfer.
2016-07-28 14:55:45 139821024540416 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.31.22.174' --datadir '/var/lib/mysql/' --parent '2159' --binlog 'mariadb-bin' '
2016-07-28 14:55:45 139821320096512 [Note] WSREP: Prepared SST request: rsync|172.31.22.174:4444/rsync_sst
2016-07-28 14:55:45 139821320096512 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-07-28 14:55:45 139821320096512 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-07-28 14:55:45 139821097465600 [Note] WSREP: Service thread queue flushed.
2016-07-28 14:55:45 139821320096512 [Note] WSREP: Assign initial position for certification: 5, protocol version: 3
2016-07-28 14:55:45 139821097465600 [Note] WSREP: Service thread queue flushed.
2016-07-28 14:55:45 139821320096512 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (d25dbeb7-54f0-11e6-bac9-c2bc3c331fb6): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2016-07-28 14:55:45 139821041313536 [Note] WSREP: Member 1.0 () requested state transfer from '*any*'. Selected 0.0 ()(SYNCED) as donor.
2016-07-28 14:55:45 139821041313536 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 5)
2016-07-28 14:55:45 139821320096512 [Note] WSREP: Requesting state transfer: success, donor: 0
2016-07-28 14:55:47 139821049706240 [Note] WSREP: (e31e0085, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-07-28 14:55:48 139821041313536 [Note] WSREP: 0.0 (): State transfer to 1.0 () complete.
2016-07-28 14:55:48 139821041313536 [Note] WSREP: Member 0.0 () synced with group.
WSREP_SST: [INFO] Extracting binlog files: (20160728 14:55:48.132)
mariadb-bin.000038
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 2199 (20160728 14:55:48.137)
WSREP_SST: [INFO] Joiner cleanup done. (20160728 14:55:48.642)
2016-07-28 14:55:48 139821320411264 [Note] WSREP: SST complete, seqno: 5

The SST re-imaged node 2 with an rsync transfer from node 1, which explains why node 2 suddenly had a table definition consistent with node 1's.
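
If you want to know in advance which SST method a node will use, it is exposed as an ordinary system variable (rsync in this demonstration). A quick illustrative check:

SHOW GLOBAL VARIABLES LIKE 'wsrep_sst_method'; -- 'rsync' here
SHOW GLOBAL VARIABLES LIKE 'wsrep_sst_donor';  -- optionally names a preferred donor node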

A DDL Failure in RSU Mode

We’ve seen what happens when DDL fails in TOI mode, but what happens when it fails in RSU mode? This is easy to demonstrate:

MariaDB [db1]> SET wsrep_osu_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [db1]> ALTER TABLE tab ADD COLUMN num int DEFAULT NULL;
ERROR 1060 (42S21): Duplicate column name 'num'

In RSU mode, DDL is executed only on the local node and works much like it does on a standalone MariaDB or MySQL server, so nothing catastrophic happens when it fails: the statement simply returns an error.
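
If you use RSU for a real rolling schema upgrade, keep in mind that the DDL is applied only to the local node, so the same statement has to be run on every node in turn. Below is a minimal sketch of that workflow; the created_at column is just a hypothetical backward-compatible change:

-- Run on EACH node, one node at a time:
SET SESSION wsrep_OSU_method = 'RSU';
ALTER TABLE tab ADD COLUMN created_at datetime DEFAULT NULL; -- hypothetical example column
SET SESSION wsrep_OSU_method = 'TOI';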

Conclusion

DDL can be kind of weird in Galera Cluster, but many of the quirks are in place to protect the integrity of your data.

Has anyone else noticed strange failures that can happen with DDL in Galera Cluster?
