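Recently we noticed that the Tomcat applications in some of our environments were starting very slowly, mostly taking 3-5 minutes. One test environment was even more extreme and needed more than ten minutes to finish starting. After careful analysis, the cause turned out to be an abnormally slow query against the data dictionary in the INFORMATION_SCHEMA database. The statement was: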
SELECT c.COLUMN_NAME, c.TABLE_NAME
FROM information_schema.TABLE_CONSTRAINTS AS t,
     information_schema.KEY_COLUMN_USAGE AS c
WHERE t.TABLE_NAME = c.TABLE_NAME
  AND t.TABLE_SCHEMA = c.CONSTRAINT_SCHEMA
  AND t.CONSTRAINT_SCHEMA = 'hs_tatrade2'
  AND t.CONSTRAINT_TYPE = 'PRIMARY KEY'
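We had never run into this before, and had rarely paid attention to the performance of MySQL data dictionary queries, because it had almost never been a problem.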
SHOW PROCESSLIST showed the query sitting in the "after opening table" state the whole time.
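I looked at the execution plan and at the definitions of the tables in information_schema. They are all in-memory tables with no indexes, and both tables hold only a few hundred rows, so in principle the query should not be this slow even without indexes.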
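Searching online, I found quite a few posts blaming innodb_stats_on_metadata=ON, which causes statistics to be updated whenever information_schema is queried. Testing showed that this was not the cause here. (Frankly, I now believe that more than 80% of the so-called analysis posts online are based on theoretical tests rather than on problems the authors actually ran into, especially those from many so-called experts.)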
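For readers who want to rule out the same suspect, the check can be sketched as follows. This is a minimal sketch assuming MySQL 5.7, where innodb_stats_on_metadata is a dynamic global variable that defaults to OFF; changing it requires a sufficiently privileged account, and the post does not say which exact statements were used in the original test.

```sql
-- Show the current setting (OFF is the default in MySQL 5.7).
SHOW GLOBAL VARIABLES LIKE 'innodb_stats_on_metadata';

-- If it is ON, turn it off at runtime and rerun the slow query to compare.
SET GLOBAL innodb_stats_on_metadata = OFF;
```

If the query is just as slow with the variable OFF, as it was in our case, the cause lies elsewhere.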
I then went back to the official MySQL documentation, https://dev.mysql.com/doc/refman/5.7/en/information-schema-optimization.html, which says:
8.2.3 Optimizing INFORMATION_SCHEMA Queries
1) Try to use constant lookup values for database and table names in the WHERE clause
You can take advantage of this principle as follows:
- To look up databases or tables, use expressions that evaluate to a constant, such as literal values, functions that return a constant, or scalar subqueries.
- Avoid queries that use a nonconstant database name lookup value (or no lookup value) because they require a scan of the data directory to find matching database directory names.
- Within a database, avoid queries that use a nonconstant table name lookup value (or no lookup value) because they require a scan of the database directory to find matching table files.
This principle applies to the INFORMATION_SCHEMA tables shown in the following table, which shows the columns for which a constant lookup value enables the server to avoid a directory scan. For example, if you are selecting from TABLES, using a constant lookup value for TABLE_SCHEMA in the WHERE clause enables a data directory scan to be avoided.
Table | Column to specify to avoid data directory scan | Column to specify to avoid database directory scan |
---|---|---|
COLUMNS | TABLE_SCHEMA | TABLE_NAME |
KEY_COLUMN_USAGE | TABLE_SCHEMA | TABLE_NAME |
PARTITIONS | TABLE_SCHEMA | TABLE_NAME |
REFERENTIAL_CONSTRAINTS | CONSTRAINT_SCHEMA | TABLE_NAME |
STATISTICS | TABLE_SCHEMA | TABLE_NAME |
TABLES | TABLE_SCHEMA | TABLE_NAME |
TABLE_CONSTRAINTS | TABLE_SCHEMA | TABLE_NAME |
TRIGGERS | EVENT_OBJECT_SCHEMA | EVENT_OBJECT_TABLE |
VIEWS | TABLE_SCHEMA | TABLE_NAME |
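In other words, when querying any of the tables listed above, always filter on TABLE_SCHEMA (or the corresponding XXX_SCHEMA column from the table) with a constant value, so that the data directory scan is avoided. Our query hit exactly this case: it filtered TABLE_CONSTRAINTS on CONSTRAINT_SCHEMA rather than TABLE_SCHEMA. TABLE_SCHEMA did appear in the join condition, but that does not help, because a join condition is not a constant lookup value.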
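One way to check whether a given filter lets the server skip the scan is to run EXPLAIN and read the Extra column. This is a sketch assuming MySQL 5.7, where the same manual page documents Extra values such as "Scanned all databases" and "Scanned 1 database" for INFORMATION_SCHEMA queries. The single-table statements below are simplified illustrations, not the application's query, and the actual Extra text may differ by version.

```sql
-- Filter on CONSTRAINT_SCHEMA only: this is not the lookup column
-- listed for TABLE_CONSTRAINTS, so a scan of all databases is expected.
EXPLAIN
SELECT TABLE_NAME
FROM information_schema.TABLE_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = 'hs_tatrade2'
  AND CONSTRAINT_TYPE = 'PRIMARY KEY';

-- Constant filter on TABLE_SCHEMA: the server can go straight
-- to the one database directory.
EXPLAIN
SELECT TABLE_NAME
FROM information_schema.TABLE_CONSTRAINTS
WHERE TABLE_SCHEMA = 'hs_tatrade2'
  AND CONSTRAINT_TYPE = 'PRIMARY KEY';
```

Comparing the Extra column of the two plans shows whether the rewrite actually removes the directory scan on a given server.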
We changed the SQL to the following:
SELECT c.COLUMN_NAME, c.TABLE_NAME
FROM information_schema.TABLE_CONSTRAINTS AS t,
     information_schema.KEY_COLUMN_USAGE AS c
WHERE t.TABLE_NAME = c.TABLE_NAME
  AND t.TABLE_SCHEMA = c.CONSTRAINT_SCHEMA
  AND t.TABLE_SCHEMA = 'hs_tatrade2'
  AND c.TABLE_SCHEMA = 'hs_tatrade2'
  AND t.CONSTRAINT_TYPE = 'PRIMARY KEY'
After that, the query returned almost instantly.
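A side note on the result set rather than the speed: the statement joins the two tables only on schema and table name, so it returns the columns of every key on a table that has a primary key, including unique and foreign keys. If the intent is to list primary key columns only (an assumption about what the application needs, not something stated above), a variant that also joins on CONSTRAINT_NAME might look like this:

```sql
-- Sketch: return only the columns that belong to each table's primary key.
-- The constant filters on t.TABLE_SCHEMA and c.TABLE_SCHEMA are kept, since
-- they are what avoids the data directory scan.
SELECT c.COLUMN_NAME, c.TABLE_NAME
FROM information_schema.TABLE_CONSTRAINTS AS t
JOIN information_schema.KEY_COLUMN_USAGE AS c
  ON  c.TABLE_SCHEMA    = t.TABLE_SCHEMA
  AND c.TABLE_NAME      = t.TABLE_NAME
  AND c.CONSTRAINT_NAME = t.CONSTRAINT_NAME
WHERE t.TABLE_SCHEMA = 'hs_tatrade2'
  AND c.TABLE_SCHEMA = 'hs_tatrade2'
  AND t.CONSTRAINT_TYPE = 'PRIMARY KEY'
ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;
```

This returns fewer rows than the statement above whenever a table has additional keys, so it should only replace the original if the caller really expects primary key columns alone.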