Why You Should Avoid Using “CREATE TABLE AS SELECT” Statement
Author: Alexander Rubin
Published: 2018-01-10
Keywords: create table as select, metadata locks, MySQL, open source database, row locking, table locking
Scope: Insight for DBAs, MySQL
Original article: http://www.percona.com/blog/2018/01/10/why-avoid-create-table-as-select-statement/
Translation: 知數堂星耀隊: 芬達、榟榮、劉莉
In this blog post, I’ll explain why you should avoid using the CREATE TABLE AS SELECT statement.
The SQL statement “create table <table_name> as select …” is used to create a normal or temporary table and materialize the result of the select. Some applications use this construct to create a copy of the table. This is one statement that will do all the work, so you do not need to create a table structure or use another statement to copy the structure.
At the same time, there are a number of problems with this statement:
- It does not create any indexes on the new table
- You are mixing transactional and non-transactional statements in one transaction. As with any DDL, it implicitly commits the current, unfinished transaction
- CREATE TABLE … SELECT is not supported when using GTID-based replication
- Metadata locks are not released until the statement has finished
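The first point is easy to check by hand. The following is a minimal sketch, assuming a hypothetical `test1` table with a primary key and one secondary index (the column definitions are illustrative and not from the original post):

```sql
-- Hypothetical source table with a primary key and a secondary index.
CREATE TABLE test1 (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(64),
  KEY idx_name (name)
);

-- Copies the column definitions and the rows, but not the indexes.
CREATE TABLE test2 AS SELECT * FROM test1;

SHOW INDEX FROM test1;  -- lists PRIMARY and idx_name
SHOW INDEX FROM test2;  -- lists no indexes

-- Any index needed on the copy has to be added explicitly.
ALTER TABLE test2 ADD PRIMARY KEY (id), ADD KEY idx_name (name);
```

On a large table the final ALTER TABLE can take a long time on its own, which is one more reason to create the structure up front.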
The CREATE TABLE AS SELECT statement can break things very badly
Let’s imagine we need to transfer money from one account to another (classic example). But in addition to just transferring funds, we need to calculate fees. The developers decide to create a table to perform a complex calculation.
Then the transaction looks like this:
begin;
update accounts set amount = amount - 100000 where account_id=123;
-- now we calculate fees
create table <table_name> as select ... join ...;
update accounts set amount = amount + 100000 where account_id=321;
commit;
The “create table as select … join …” statement implicitly commits the transaction, which is not safe. If an error occurs afterwards, the second account will not be credited, yet the debit from the first account has already been committed!
Instead of “create table …”, we can use “create temporary table …”, which fixes the issue because creating a temporary table does not cause an implicit commit.
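A sketch of the same transaction using a temporary table might look like the following. The `tmp_fees` table and the fee formula are made up for illustration; only the `accounts` table and the two updates come from the example above. Note that on MySQL 5.6 and 5.7 with ENFORCE_GTID_CONSISTENCY = 1, creating a temporary table inside a transaction is rejected as well, so this assumes GTID consistency is not enforced (or a newer server version).

```sql
BEGIN;

UPDATE accounts SET amount = amount - 100000 WHERE account_id = 123;

-- TEMPORARY keeps the transaction open: no implicit commit happens here.
-- The fee rule below is a hypothetical placeholder.
CREATE TEMPORARY TABLE tmp_fees AS
  SELECT account_id, amount * 0.001 AS fee
    FROM accounts
   WHERE account_id IN (123, 321);

UPDATE accounts SET amount = amount + 100000 WHERE account_id = 321;

COMMIT;

-- Dropping a temporary table does not cause an implicit commit either.
DROP TEMPORARY TABLE tmp_fees;
```

If anything fails before COMMIT, a ROLLBACK still undoes both updates, which is the property the original transaction lost.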
GTID issue
If you try to use CREATE TABLE AS SELECT when GTID is enabled (and ENFORCE_GTID_CONSISTENCY = 1) you get this error:
General error: 1786 CREATE TABLE ... SELECT is forbidden when @@GLOBAL.ENFORCE_GTID_CONSISTENCY = 1.
As a result, application code that relies on this statement may break.
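One way to find out in advance whether a server would reject the statement is to look at its GTID settings. This is only a small sketch using the standard system variables; what the application does with the result is left open.

```sql
-- gtid_mode = ON together with enforce_gtid_consistency = ON (or 1)
-- is the configuration in which error 1786 is raised.
SELECT @@GLOBAL.gtid_mode,
       @@GLOBAL.enforce_gtid_consistency;
```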
Metadata lock issue
The metadata lock issue with CREATE TABLE AS SELECT is less well known. (More information about metadata locking in general). Please note: a MySQL metadata lock is different from an InnoDB deadlock, a row-level lock, or a table-level lock.
This quick simulation demonstrates the metadata lock:
Session 1:
mysql> create table test2 as select * from test1;
Session 2:
mysql> select * from test2 limit 10;
-- blocked statement
This statement is waiting for the metadata lock:
Session 3:
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+-------------------------------------------
| Id | User | Host | db | Command | Time | State | Info
+----+------+-----------+------+---------+------+---------------------------------+-------------------------------------------
| 2 | root | localhost | test | Query | 18 | Sending data | create table test2 as select * from test1
| 3 | root | localhost | test | Query | 7 | Waiting for table metadata lock | select * from test2 limit 10
| 4 | root | localhost | NULL | Query | 0 | NULL | show processlist
+----+------+-----------+------+---------+------+---------------------------------+-------------------------------------------
The same can happen the other way around: a slow select query can block some DDL operations (e.g., rename or drop):
mysql> show processlist\G
*************************** 1. row ***************************
Id: 4
User: root
Host: localhost
db: reporting_stage
Command: Query
Time: 0
State: NULL
Info: show processlist
Rows_sent: 0
Rows_examined: 0
Rows_read: 0
*************************** 2. row ***************************
Id: 5
User: root
Host: localhost
db: test
Command: Query
Time: 9
State: Copying to tmp table
Info: select count(*), name from test2 group by name order by cid
Rows_sent: 0
Rows_examined: 0
Rows_read: 0
*************************** 3. row ***************************
Id: 6
User: root
Host: localhost
db: test
Command: Query
Time: 5
State: Waiting for table metadata lock
Info: rename table test2 to test4
Rows_sent: 0
Rows_examined: 0
Rows_read: 0
3 rows in set (0.00 sec)
As we can see, CREATE TABLE AS SELECT can affect other queries. However, the problem here is not the metadata lock itself (the metadata lock is needed to preserve consistency). The problem is that the metadata lock will not be released until the statement is finished.
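Besides the process list, the locks themselves can be inspected. The sketch below assumes MySQL 5.7 or later, where the `performance_schema.metadata_locks` table exists; in 5.7 the metadata lock instrument is disabled by default, so it is enabled first. The schema and table names match the simulation above.

```sql
-- Enable metadata lock instrumentation (needed on 5.7, where it is off by default).
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME = 'wait/lock/metadata/sql/mdl';

-- GRANTED rows belong to the session holding the lock,
-- PENDING rows to the sessions waiting for it.
SELECT object_schema, object_name, lock_type, lock_status, owner_thread_id
  FROM performance_schema.metadata_locks
 WHERE object_schema = 'test'
   AND object_name = 'test2';
```

Locks taken before the instrument was enabled are not recorded, so the blocking statement has to start after the UPDATE for it to show up.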
The fix is simple: copy the table structure first by doing “create table new_table like old_table”, then do “insert into new_table select …”. The metadata lock is still held for the create table part (very short), but not for the “insert … select” part, so the total time the lock is held is much shorter. To illustrate the difference, let’s look at two cases:
- With “create table table_new as select … from table1”, other application connections can’t read from the destination table (table_new) for the duration of the statement (even “show fields from table_new” will be blocked)
- With “create table new_table like old_table” + “insert into new_table select …”, other application connections are not blocked by the metadata lock during the “insert into new_table select …” part.
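Written out in full, the two-step copy can be sketched as follows, using the same placeholder table names as the text:

```sql
-- Step 1: copy the structure only. This is quick, and unlike
-- CREATE TABLE ... AS SELECT it also copies the index definitions.
CREATE TABLE new_table LIKE old_table;

-- Step 2: copy the rows as ordinary DML.
INSERT INTO new_table SELECT * FROM old_table;
```

Because neither statement is a CREATE TABLE … SELECT, this form also avoids the GTID error described earlier.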
In some cases, however, the table structure is not known beforehand. For example, we may need to materialize the result set of a complex select statement, involving joins and/or group by. In this case, we can use this trick:
create table new_table as select ... join ... group by ... limit 0;
insert into new_table select ... join ... group by ...;
The first statement creates the table structure and doesn’t insert any rows (LIMIT 0). It places a metadata lock, but only very briefly. The second statement actually inserts the rows into the table and doesn’t place a metadata lock that blocks other sessions.
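As a concrete instance of this trick, here is a sketch that materializes an aggregate over the `test1` table from the earlier simulation. The `name` column and the `name_counts` table are assumptions made for illustration.

```sql
-- Step 1: derive the structure from the query without copying rows.
CREATE TABLE name_counts AS
  SELECT name, COUNT(*) AS cnt
    FROM test1
   GROUP BY name
   LIMIT 0;

-- Step 2: run the expensive query as a plain INSERT ... SELECT.
INSERT INTO name_counts
  SELECT name, COUNT(*) AS cnt
    FROM test1
   GROUP BY name;
```

The first statement is still a CREATE TABLE … SELECT, so on a server that enforces GTID consistency it is rejected in the same way; there the structure has to be declared with an explicit CREATE TABLE instead.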