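In Oracle Database, we sometimes use SQL statements to find problem SQL that involves implicit conversion. One query that circulates widely online is shown below; it checks whether the FILTER_PREDICATES column of V$SQL_PLAN contains INTERNAL_FUNCTION: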
SELECT
SQL_ID,
PLAN_HASH_VALUE
FROM
V$SQL_PLAN X
WHERE
X.FILTER_PREDICATES LIKE '%INTERNAL_FUNCTION%'
GROUP BY
SQL_ID,
PLAN_HASH_VALUE;
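However, my testing shows that INTERNAL_FUNCTION appearing in an execution plan does not necessarily mean that an implicit data type conversion took place. Drawing on the blog post "What the heck is the INTERNAL_FUNCTION in execution plan predicate section?", this article explains what INTERNAL_FUNCTION in the predicate section of an execution plan really is. It is not a direct translation of that post; rather, it is a short explanation based on my own understanding. The official documentation says very little about INTERNAL_FUNCTION. The most common interpretation, that INTERNAL_FUNCTION is a special function used to perform implicit datatype conversion, probably comes from the official documentation at https://docs.oracle.com/cd/E11882_01/server.112/e25523/part_avail.htm#sthref141 . That statement is only partly correct; it is not the whole truth. In fact, Oracle has no function named INTERNAL_FUNCTION: a lookup in the V$SQLFN_METADATA view, using the script below, finds no such entry.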
COL sqlfn_descr HEAD DESCRIPTION FOR A100 WORD_WRAP
COL sqlfn_name HEAD NAME FOR A30
SELECT
func_id
, name sqlfn_name
, offloadable
-- , usage
, minargs
, maxargs
-- this is just to avoid clutter on screen
, CASE WHEN name != descr THEN descr ELSE null END sqlfn_descr
FROM
v$sqlfn_metadata
WHERE
UPPER(name) LIKE UPPER('%&1%')
/
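The script above depends on the SQL*Plus substitution variable &1. As a minimal sketch of the same lookup with the search string hard-coded (assuming your Oracle version exposes V$SQLFN_METADATA and your account may query it), something like the following can be run directly:

```sql
-- Look for any built-in SQL function whose name resembles INTERNAL_FUNCTION.
-- Per the discussion above, no entry with that exact name is expected.
SELECT func_id,
       name,
       minargs,
       maxargs
FROM   v$sqlfn_metadata
WHERE  UPPER(name) LIKE '%INTERNAL%FUNCTION%';
```

The pattern is deliberately loose so that near matches would also show up.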
Generally, when INTERNAL_FUNCTION appears in the predicate section of an execution plan, it may mean that an implicit data type conversion occurred. Here is a simple example:
SQL> CREATE TABLE t(a VARCHAR2(20), b DATE);
Table created.
SQL> INSERT INTO t VALUES( TO_CHAR(sysdate), sysdate) ;
1 row created.
SQL> commit;
Commit complete.
As shown below, this SQL statement triggers an implicit datatype conversion:
SQL> SELECT * FROM t WHERE a = b;
no rows selected
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID 4ptcbny27y9b0, child number 0
-------------------------------------
SELECT * FROM t WHERE a = b
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
|* 1 | TABLE ACCESS FULL| T | 1 | 21 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("B"=INTERNAL_FUNCTION("A"))
Note
-----
- dynamic sampling used for this statement
22 rows selected.
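The execution plan shows that, to compare two different data types (column A against column B), Oracle was forced to wrap a datatype conversion function around column A. Internally, the operation changes from WHERE a = b to something like WHERE TO_DATE(a) = b. This is one reason INTERNAL_FUNCTION appears in the plan: the code that generates the human-readable plan from the actual "binary" plan cannot translate the internal opcode into a corresponding human-readable function name, so it displays the default string "INTERNAL_FUNCTION" instead. The original English passage follows for comparison: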
What happens here is that Oracle is forced to (implicitly) add a datatype conversion function around column A, to be able to physically compare two different datatypes. Internally Oracle is not running a comparison "WHERE a = b" anymore, but rather something like "WHERE TO_DATE(a) = b". This is one of the reasons why the INTERNAL_FUNCTION shows up – the code generating the human-readable execution plan from the actual “binary” execution plan is not able to convert the internal opcode to a corresponding human-readable function name, thus shows a default “INTERNAL_FUNCTION” string there instead.
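An implicit VARCHAR2-to-DATE conversion uses the session's NLS_DATE_FORMAT, so the result of WHERE a = b can change from session to session. The sketch below shows one way to make the conversion explicit. The format mask 'DD-MON-RR' is only an assumption for illustration; it must match whatever format the strings in column A were actually stored with. No claim is made here about how the predicate section will render the explicit version.

```sql
-- Check which format mask an implicit string-to-date conversion would use.
SELECT value
FROM   nls_session_parameters
WHERE  parameter = 'NLS_DATE_FORMAT';

-- Explicit conversion with an assumed mask ('DD-MON-RR' is hypothetical;
-- replace it with the mask that matches the data stored in column A).
SELECT *
FROM   t
WHERE  TO_DATE(a, 'DD-MON-RR') = b;
```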
Un-unparseable Complex Expressions
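INTERNAL_FUNCTION can also appear in an execution plan for another reason: complex expressions that cannot be unparsed (un-unparseable complex expressions). The following example illustrates this: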
SQL> drop table t purge;
Table dropped.
SQL> CREATE TABLE t AS SELECT * FROM dba_objects;
Table created.
SQL> SELECT COUNT(*) FROM t WHERE owner = 'SYS' OR owner = 'SYSTEM';
COUNT(*)
----------
23851
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID 77xzyugx5q3kf, child number 0
-------------------------------------
SELECT COUNT(*) FROM t WHERE owner = 'SYS' OR owner = 'SYSTEM'
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 108 (100)| |
| 1 | SORT AGGREGATE | | 1 | 17 | | |
|* 2 | TABLE ACCESS FULL| T | 22494 | 373K| 108 (7)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("OWNER"='SYS' OR "OWNER"='SYSTEM'))
Note
-----
- dynamic sampling used for this statement
Now let's make the predicate slightly more complex by adding another OR to the WHERE clause, this time on a different column, object_id:
SQL> SELECT COUNT(*) FROM t WHERE owner = 'SYS' OR owner = 'SYSTEM' OR object_id = 123;
COUNT(*)
----------
23851
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID 9vh8b6ku8sd1t, child number 0
-------------------------------------
SELECT COUNT(*) FROM t WHERE owner = 'SYS' OR owner = 'SYSTEM' OR
object_id = 123
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 111 (100)| |
| 1 | SORT AGGREGATE | | 1 | 30 | | |
|* 2 | TABLE ACCESS FULL| T | 22494 | 659K| 111 (10)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter((INTERNAL_FUNCTION("OWNER") OR "OBJECT_ID"=123))
Note
-----
- dynamic sampling used for this statement
24 rows selected.
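After the WHERE clause was changed, the two conditions on the OWNER column disappeared and were replaced by INTERNAL_FUNCTION. Next, let's use the IN operator instead of OR. Because the OR in the statement above spans different columns, the statement has to be restructured: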
SQL> SELECT COUNT(*) FROM t WHERE owner IN ('SYS','SYSTEM','SCOTT') AND object_type = 'TABLE';
COUNT(*)
----------
896
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID gcqgrmtna9g1u, child number 0
-------------------------------------
SELECT COUNT(*) FROM t WHERE owner IN ('SYS','SYSTEM','SCOTT') AND
object_type = 'TABLE'
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 111 (100)| |
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| T | 894 | 14304 | 111 (10)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("OBJECT_TYPE"='TABLE' AND INTERNAL_FUNCTION("OWNER")))
20 rows selected.
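Unfortunately, INTERNAL_FUNCTION still appears in the predicate section of this plan. Let's simplify the logic and search for three values on a single column only: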
SQL> SELECT COUNT(*) FROM t WHERE owner IN ('SYS','SYSTEM','SCOTT');
COUNT(*)
----------
23857
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID 2qazbqj67y17s, child number 0
-------------------------------------
SELECT COUNT(*) FROM t WHERE owner IN ('SYS','SYSTEM','SCOTT')
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 111 (100)| |
| 1 | SORT AGGREGATE | | 1 | 7 | | |
|* 2 | TABLE ACCESS FULL| T | 24133 | 164K| 111 (10)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("OWNER"='SCOTT' OR "OWNER"='SYS' OR "OWNER"='SYSTEM'))
19 rows selected.
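As shown above, this works: Oracle has transformed the IN predicate into (or at least displays it in the plan as) a set of OR-ed conditions on the same column.
You may already have noticed the pattern in the earlier plan output: DBMS_XPLAN.DISPLAY_CURSOR cannot explain a "complex" compound predicate applied in a single plan step when the predicate involves multiple different columns and at least one of those columns has multiple values to check (for example, an IN list or OR-ed predicates).
Where does DISPLAY_CURSOR get its data, and how does it interpret it?
DBMS_XPLAN.DISPLAY_CURSOR reads its plan data from V$SQL_PLAN; the predicate section comes from the ACCESS_PREDICATES and FILTER_PREDICATES columns. When I query V$SQL_PLAN directly, however, I still see the same problem: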
SQL> SELECT id, filter_predicates FROM v$sql_plan WHERE sql_id = 'gcqgrmtna9g1u';
ID FILTER_PREDICATES
---------- ------------------------------------------------------------
0
1
2 (INTERNAL_FUNCTION("OWNER") AND "OBJECT_TYPE"='TABLE')
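You may have noticed that the original ORed conditions shown earlier also have parentheses () around them. In 9i this meant that an "unparseable" internal function was wrapped around the predicate in the "binary" execution plan, but in this case (10g and later support the INTERNAL_FUNCTION naming) a blank function name should not occur. I am not sure why it does, and the question goes too deep for this article.
The V$SQL_PLAN view itself accesses the actual "binary" child cursor in the library cache (after taking the appropriate latches/pins/mutexes) and unparses it. Why that term? Parsing takes human-readable input and translates it into a computer-understandable "binary" format. Unparsing is exactly the opposite: V$SQL_PLAN accesses the in-memory structure of the cursor's "binary" execution plan and translates it into human-readable plan output. There is even a parameter that controls this behavior of V$SQL_PLAN; if it is set to false, the ACCESS_PREDICATES and FILTER_PREDICATES columns will be empty.
This passage is difficult to translate and the translation may be imperfect, so the original English is given for reference: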
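Since the predicate text can sit in either of two columns, it can help to look at both side by side for a given cursor. A minimal sketch, reusing the SQL_ID from the example above and assuming child number 0 (adjust both for your own cursor):

```sql
-- Show both predicate columns for every step of one child cursor.
SELECT id,
       operation,
       options,
       object_name,
       access_predicates,
       filter_predicates
FROM   v$sql_plan
WHERE  sql_id = 'gcqgrmtna9g1u'   -- SQL_ID taken from the example above
AND    child_number = 0           -- assumed child number
ORDER  BY id;
```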
The V$SQL_PLAN view itself accesses the actual “binary” child cursor in library cache (after taking appropriate latches/pins/mutexes) and UNPARSES it. Why such term – well isn’t parsing something that takes a human readable input and translates it into computer-understandable “binary” format. Thus unparsing is the opposite – V$SQL_PLAN accesses the cursor’s “binary” execution plan memory structure and translates it to human-readable execution plan output. There’s even a parameter controlling this V$SQL_PLAN behavior, if it’s set to false, the ACCESS_PREDICATES and FILTER_PREDICATES columns will be empty there:
SQL> @pd unparse
Show all parameters and session values from x$ksppi/x$ksppcv...
NAME VALUE DESCRIPTION
----------------------------- --------- -----------------------------------------------
_cursor_plan_unparse_enabled TRUE enables/disables using unparse to build
projection/predicates
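The @pd call above is a helper script whose contents are not shown here. Its banner says it reads x$ksppi and x$ksppcv, so a rough equivalent, offered only as a sketch, might look like the query below. It assumes a connection as SYS (the X$ tables are not visible to ordinary users), and hidden parameters should not be changed without guidance from Oracle Support.

```sql
-- Approximate stand-in for "@pd unparse": list parameters whose name
-- contains "unparse", with the current session value and description.
SELECT i.ksppinm  AS name,
       v.ksppstvl AS value,
       i.ksppdesc AS description
FROM   x$ksppi  i,
       x$ksppcv v
WHERE  i.indx = v.indx
AND    LOWER(i.ksppinm) LIKE '%unparse%';
```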
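By the way, why do I keep saying "binary" execution plan, in double quotes? I want to stress that Oracle's real execution plan is not the text we see on screen. That text is generated only to make the plan readable to humans during troubleshooting. Nor is the execution plan a real executable binary (like oracle.exe) that is fed directly to the CPU. The physical execution plan in the library cache child cursor is a bunch of opcodes, object_ids, and pointers that define the hierarchy and order of rowsource execution. The SQL execution engine loops through these opcodes, decodes them, and then knows what to do next (which rowsource function to call).
So, as described above, some predicates with complex AND/OR conditions are displayed as INTERNAL_FUNCTION() by DBMS_XPLAN.DISPLAY_CURSOR and V$SQL_PLAN, because they cannot fully decode (unparse) the execution plan information.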
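Because INTERNAL_FUNCTION can stand for either an implicit conversion or a predicate that merely could not be unparsed, a bare list of SQL_IDs is hard to triage. The sketch below is one possible variation of the detection query: it returns the predicate text and the SQL text together and also checks ACCESS_PREDICATES. Judging each row still has to be done by eye; for example, a predicate such as INTERNAL_FUNCTION("OWNER") standing alone as a condition resembles the IN-list case shown earlier rather than a type conversion.

```sql
-- List plan steps whose predicates mention INTERNAL_FUNCTION,
-- together with the statement text, for manual review.
SELECT p.sql_id,
       p.child_number,
       p.id,
       p.access_predicates,
       p.filter_predicates,
       s.sql_text
FROM   v$sql_plan p
JOIN   v$sql s
       ON  s.sql_id       = p.sql_id
       AND s.child_number = p.child_number
WHERE  p.filter_predicates LIKE '%INTERNAL_FUNCTION%'
OR     p.access_predicates LIKE '%INTERNAL_FUNCTION%'
ORDER  BY p.sql_id, p.child_number, p.id;
```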
Using the good old EXPLAIN PLAN
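There is good news, though! The old EXPLAIN PLAN command can unparse these complex predicates correctly (some of them, at least). Because EXPLAIN PLAN parses the given SQL statement in a special, more instrumented way, it apparently has more information at hand (and it also uses more memory). Or perhaps whoever wrote V$SQL_PLAN simply did not write the code to unparse more complex predicates :). For example: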
SQL> EXPLAIN PLAN FOR
2 SELECT COUNT(*) FROM t WHERE owner IN ('SYS','SYSTEM','SCOTT') AND object_type = 'TABLE';
Explained.
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 111 (10)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| T | 894 | 14304 | 111 (10)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_TYPE"='TABLE' AND ("OWNER"='SCOTT' OR
"OWNER"='SYS' OR "OWNER"='SYSTEM'))
15 rows selected.
SQL>
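Like magic, INTERNAL_FUNCTION is gone and all predicates are displayed correctly. EXPLAIN PLAN is very useful here.
So although I normally avoid EXPLAIN PLAN, because the plan it reports can lie to you, whenever I see INTERNAL_FUNCTION in DISPLAY_CURSOR, V$SQL_PLAN, or SQL Monitor output, I run EXPLAIN PLAN on the same SQL in the hope of quickly finding out what the INTERNAL_FUNCTION predicate really stands for.
References: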
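The comparison described above can be sketched as two calls, one against the cached cursor and one against PLAN_TABLE. The statement ID 'intfn_check' is a made-up label, and the SQL_ID and child number are taken from the earlier example; substitute your own values. The 'BASIC +PREDICATE' format keeps the output short while still printing the predicate section.

```sql
-- 1. What the cached cursor shows (may contain INTERNAL_FUNCTION).
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('gcqgrmtna9g1u', 0, 'BASIC +PREDICATE'));

-- 2. Explain the same statement under a hypothetical statement ID ...
EXPLAIN PLAN SET STATEMENT_ID = 'intfn_check' FOR
SELECT COUNT(*)
FROM   t
WHERE  owner IN ('SYS','SYSTEM','SCOTT')
AND    object_type = 'TABLE';

-- 3. ... and display only that statement's plan and predicates.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'intfn_check', 'BASIC +PREDICATE'));
```

Keep in mind that EXPLAIN PLAN may report a different plan from the one actually executed, so use it here only to read the predicates.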
https://blog.tanelpoder.com/2013/01/16/what-the-heck-is-the-internal_function-in-execution-plan-predicate-section/
https://docs.oracle.com/cd/E11882_01/server.112/e25523/part_avail.htm#sthref141