MySQL SELECT_LEX and subselect execution: source code reading notes


Based on the MySQL 8.0 community version.

The details of JOIN::exec are not covered in this article.

SELECT_LEX

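In the code this is usually abbreviated as select. A SELECT_LEX represents one SELECT ... FROM ... WHERE query block; it may be a subselect or the outermost (top-level) query block. SELECT_LEX has prepare and optimize methods but no execute method: when a SELECT_LEX_UNIT executes, it calls select->join->exec() directly.

Some important member variables (a few parse-related ones are skipped for now):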

  /**
    Intrusive double-linked list of all query blocks within the same
    query expression.
    
    Under a UNION, a SELECT_LEX belongs to a SELECT_LEX_UNIT; the list links are maintained here.
  */
  SELECT_LEX *next;
  SELECT_LEX **prev;

  /// The query expression containing this query block.
  /// That is, the parent SELECT_LEX_UNIT that contains this select.
  SELECT_LEX_UNIT *master;
  /// The first query expression contained within this query block.
  /// That is, the first child SELECT_LEX_UNIT nested inside this query block.
  SELECT_LEX_UNIT *slave;
  /// For how SELECT_LEX and SELECT_LEX_UNIT are combined, see:
  /// https://dev.mysql.com/doc/internals/en/select-structure.html


  /// Intrusive double-linked global list of query blocks.
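  /// All query blocks of the current statement as one global list (handy for traversal).
  SELECT_LEX *link_next;
  SELECT_LEX **link_prev;

  /// Result of this query block
  /// The object that handles the final query result, e.g. sends it to the client or writes it to a file.
  Query_result *m_query_result;

  /// Describes context of this query block (e.g if it is a derived table).
  /// Defaults to UNSPECIFIED. After parsing, only UNION_TYPE and DERIVED_TABLE_TYPE
  /// matter in practice: the former marks a UNION member, the latter a query block
  /// that produces a derived table.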
  enum sub_select_type linkage;

  /**
    Condition to be evaluated after all tables in a query block are joined.
    After all permanent transformations have been conducted by
    SELECT_LEX::prepare(), this condition is "frozen", any subsequent changes
    to it must be done with change_item_tree(), unless they only modify AND/OR
    items and use a copy created by SELECT_LEX::get_optimizable_conditions().
    Same is true for 'having_cond'.
  */
  /// The WHERE condition.
  Item *m_where_cond;

  /// Condition to be evaluated on grouped rows after grouping.
  /// The HAVING condition. TODO: are some HAVING conditions converted into WHERE conditions?
  Item *m_having_cond;

  /**
    Saved values of the WHERE and HAVING clauses. Allowed values are:
     - COND_UNDEF if the condition was not specified in the query or if it
       has not been optimized yet
     - COND_TRUE if the condition is always true
     - COND_FALSE if the condition is impossible
     - COND_OK otherwise
  */
  Item::cond_result cond_value;    // where cond result
  Item::cond_result having_value;  // having result

  // Usually UNSPECIFIED_OLAP_TYPE; for ROLLUP see: https://dev.mysql.com/doc/refman/8.0/en/group-by-modifiers.html
  enum olap_type olap;

  /**
    After optimization it is pointer to corresponding JOIN. This member
    should be changed only when THD::LOCK_query_plan mutex is taken.
  */
  // The JOIN object corresponding to this select
  JOIN *join;
  /// join list of the top level
  List<TABLE_LIST> top_join_list;
  /// list for the currently parsed join
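  /// Mainly used during parsing, although optimize and a few other places also refer to it;
  /// that may be a misuse, and top_join_list may be the list they should use.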
  List<TABLE_LIST> *join_list;
  /// table embedding the above list
  TABLE_LIST *embedding;
  /// List of semi-join nests generated for this query block
  List<TABLE_LIST> sj_nests;
  /**
    Points to first leaf table of query block. After setup_tables() is done,
    this is a list of base tables and derived tables. After derived tables
    processing is done, this is a list of base tables only.
    Use TABLE_LIST::next_leaf to traverse the list.
  */
  // Points to the first real (leaf) table
  TABLE_LIST *leaf_tables;

  /**
     If this query block is a recursive member of a recursive unit: the
     TABLE_LIST, in this recursive member, referencing the query
     name.
  */
  // Points to the table to recurse on (in a recursive CTE)
  TABLE_LIST *recursive_reference;
  /**
     To pass the first steps of resolution, a recursive reference is made to
     be a dummy derived table; after the temporary table is created based on
     the non-recursive members' types, the recursive reference is made to be a
     reference to the tmp table. Its dummy-derived-table unit is saved in this
     member, so that when the statement's execution ends, the reference can be
     restored to be a dummy derived table for the next execution, which is
     necessary if we have a prepared statement.
     WL#6570 should allow to remove this.
  */
  SELECT_LEX_UNIT *recursive_dummy_unit;
  

SELECT_LEX_UNIT

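In the code this is usually abbreviated as unit. A SELECT_LEX_UNIT represents a group of SELECT structures combined by SELECT-level set operations such as UNION / INTERSECT / EXCEPT. In the version read here only UNION is supported, so it is effectively a simple list.

Some important member variables: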

  /**
    Intrusive double-linked list of all query expressions
    immediately contained within the same query block.
    A SELECT_LEX_UNIT belongs to a SELECT_LEX; the list links are maintained here.
  */
  SELECT_LEX_UNIT *next;
  SELECT_LEX_UNIT **prev;

  /**
    The query block wherein this query expression is contained,
    NULL if the query block is the outer-most one.
  */
  /// That is, the parent SELECT_LEX that contains this SELECT_LEX_UNIT.
  SELECT_LEX *master;
  /// The first query block in this query expression.
  /// That is, the first child SELECT_LEX of this unit.
  SELECT_LEX *slave;

  bool prepared;   ///< All query blocks in query expression are prepared
  bool optimized;  ///< All query blocks in query expression are optimized
  bool executed;   ///< Query expression has been executed

  TABLE_LIST result_table_list;
  // The result of A UNION B
  Query_result_union *union_result;
  TABLE *table; /* temporary table using for appending UNION results */
  /// Object to which the result for this query expression is sent
  Query_result *m_query_result;

  // list of fields which points to temporary table for union
  List<Item> item_list;
  /*
    list of types of items inside union (used for union & derived tables)

    Item_type_holders from which this list consist may have pointers to Field,
    pointers is valid only after preparing SELECTS of this unit and before
    any SELECT of this unit execution

    TODO:
    Possibly this member should be protected, and its direct use replaced
    by get_unit_column_types(). Check the places where it is used.
  */
  List<Item> types;

  /* LIMIT clause runtime counters */
  ha_rows select_limit_cnt, offset_limit_cnt;
  /// Points to subquery if this query expression is used in one, otherwise NULL
  // The Item_subselect that contains this unit (when it is a subselect in WHERE/HAVING)
  Item_subselect *item;

  /**
    Helper query block for query expression with UNION or multi-level
    ORDER BY/LIMIT
  */
  // The result of a UNION (or of an outer ORDER BY) is sent out through an extra, fake select:
  // SELECT A UNION SELECT B -->
  //    SELECT * FROM (SELECT A UNION SELECT B) AS UNION_RESULT
  SELECT_LEX *fake_select_lex;
  /**
    SELECT_LEX that stores LIMIT and OFFSET for UNION ALL when no
    fake_select_lex is used.
  */
  SELECT_LEX *saved_fake_select_lex;
  /**
     Points to last query block which has UNION DISTINCT on its left.
     In a list of UNIONed blocks, UNION is left-associative; so UNION DISTINCT
     eliminates duplicates in all blocks up to the first one on its right
     included. Which is why we only need to remember that query block.
  */
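  // In practice, rec0 UNION ALL rec1 UNION DISTINCT rec2 UNION ALL rec3 is currently not supported, i.e. UNION ALL cannot follow UNION DISTINCT. Also, in a mixed UNION a DISTINCT overrides the ALL semantics to its left, so it is unclear what a mixed UNION is useful for.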
  SELECT_LEX *union_distinct;

  /**
    The WITH clause which is the first part of this query expression. NULL if
    none.
  */
  // Whether this query expression carries a WITH clause at its own level (i.e. is a CTE query)
  PT_with_clause *m_with_clause;
  /**
    If this query expression is underlying of a derived table, the derived
    table. NULL if none.
  */
  // The derived table that this unit will produce (if it produces one)
  TABLE_LIST *derived_table;
  /**
     First query block (in this UNION) which references the CTE.
     NULL if not the query expression of a recursive CTE.
  */
  // Within this unit, first_recursive is the first select that references the CTE: every select after it is recursive, every select before it is non-recursive.
  SELECT_LEX *first_recursive;

  /**
    True if the with-recursive algorithm has produced the complete result.
    In a recursive CTE, a JOIN is executed several times in a loop, and
    should not be cleaned up (e.g. by join_free()) before all iterations of
    the loop are done (i.e. before the CTE's result is complete).
  */
  // Checked while reading a recursive CTE to tell whether reading is finished
  bool got_all_recursive_rows;
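
The master / slave / next links described above form an alternating tree: a query block contains query expressions, and each query expression contains query blocks. The following is a minimal sketch of that shape only. Block and Unit are hypothetical stand-ins, not MySQL's classes, and the prev back links are left out to keep the example short.

```cpp
#include <string>
#include <vector>

struct Unit;  // hypothetical stand-in for SELECT_LEX_UNIT

// Hypothetical stand-in for SELECT_LEX (one query block).
struct Block {
  std::string name;
  Unit *master = nullptr;  // query expression containing this block
  Unit *slave = nullptr;   // first query expression nested in this block
  Block *next = nullptr;   // next block in the same UNION list
};

struct Unit {
  Block *master = nullptr;  // enclosing block, nullptr for the outermost unit
  Block *slave = nullptr;   // first block of this query expression
  Unit *next = nullptr;     // next sibling unit inside the same block
};

// Attach `b` as the last block of `u`.
void add_block(Unit &u, Block &b) {
  b.master = &u;
  if (!u.slave) { u.slave = &b; return; }
  Block *last = u.slave;
  while (last->next) last = last->next;
  last->next = &b;
}

// Attach `u` as the last inner unit of `b`.
void add_unit(Block &b, Unit &u) {
  u.master = &b;
  if (!b.slave) { b.slave = &u; return; }
  Unit *last = b.slave;
  while (last->next) last = last->next;
  last->next = &u;
}

// Depth-first walk: collect block names, outer blocks before inner ones.
void collect(const Unit &u, std::vector<std::string> &out) {
  for (const Block *b = u.slave; b; b = b->next) {
    out.push_back(b->name);
    for (const Unit *inner = b->slave; inner; inner = inner->next)
      collect(*inner, out);
  }
}

// Builds the shape of: SELECT ... WHERE x IN (SELECT ... UNION SELECT ...)
std::vector<std::string> demo_block_order() {
  Unit top, sub;
  Block outer{"outer"}, a{"sub_a"}, b{"sub_b"};
  add_block(top, outer);
  add_unit(outer, sub);
  add_block(sub, a);
  add_block(sub, b);
  std::vector<std::string> names;
  collect(top, names);
  return names;
}
```

Calling demo_block_order() visits the outer block first and then the two UNION members of the nested unit, which is the order in which a top-down walk over master/slave/next would meet them.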

SELECT paths

The employees table from the CTE section of the official documentation is used as the example.

CREATE TABLE employees (
  id         INT PRIMARY KEY NOT NULL,
  name       VARCHAR(100) NOT NULL,
  manager_id INT NULL
) ENGINE='innodb';

INSERT INTO employees VALUES
(333, "Yasmina", NULL),  # Yasmina is the CEO (manager_id is NULL)
(198, "John", 333),      # John has ID 198 and reports to 333 (Yasmina)
(692, "Tarek", 333),
(29, "Pedro", 198),
(4610, "Sarah", 29),
(72, "Pierre", 29),
(123, "Adil", 692);

1. Simple query

SELECT id FROM employees;

image

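prepare first enters Sql_cmd_select::prepare_inner. For a unit without UNION it calls prepare on the unit's only child select directly; otherwise it calls unit->prepare, which in turn iterates over the selects and calls select->prepare on each.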

/// @return true for a query expression without UNION or multi-level ORDER
bool SELECT_LEX_UNIT::is_simple() const { return !(is_union() || fake_select_lex); }


// bool Sql_cmd_select::prepare_inner(THD *thd)
  if (unit->is_simple()) {
    // the unit has a single child select (which may itself still contain subqueries)
    SELECT_LEX *const select = unit->first_select();
    select->context.resolve_in_select_list = true;
    select->set_query_result(result);
    select->make_active_options(0, 0);
    select->fields_list = select->item_list;

    if (select->prepare(thd)) return true;

    unit->set_prepared();
  } else {
    if (unit->prepare(thd, result, SELECT_NO_UNLOCK, 0)) return true;
  }

execute first enters Sql_cmd_dml::execute_inner. SELECT_LEX has no execute method, so for a simple unit join->exec() is called directly; unit->execute calls join->exec() for each of its child selects.

/**
  Execute a DML statement.
  This is the default implementation for a DML statement and uses a
  nested-loop join processor per outer-most query block.
  The implementation is split in two: One for query expressions containing
  a single query block and one for query expressions containing multiple
  query blocks combined with UNION.
*/
bool Sql_cmd_dml::execute_inner(THD *thd) {
  SELECT_LEX_UNIT *unit = lex->unit;

  // optimize
  if (unit->is_simple()) {
    if (unit->set_limit(thd, unit->global_parameters()))
      return true; /* purecov: inspected */
    if (unit->first_select()->optimize(thd)) return true;

    unit->set_optimized();
  } else {
    if (unit->optimize(thd)) return true;
  }

  // explain or execute
  if (lex->is_explain()) {
    if (explain_query(thd, unit)) return true; /* purecov: inspected */
  } else {
    if (unit->is_simple()) {
      unit->first_select()->join->exec();
      unit->set_executed();
      if (thd->is_error()) return true;
    } else {
      if (unit->execute(thd)) return true;
    }
  }

  return false;
}


2. union query

SELECT id from employees UNION SELECT manager_id from employees;
// in exec: SELECT id FROM (SELECT id from employees UNION SELECT manager_id from employees);

image

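This time the !unit->is_simple() branch from the previous section is taken and unit->prepare runs. Once both subqueries are prepared, unit->prepare enters unit->prepare_fake_select_lex (see the fake_select_lex member described above). In other words, the query actually executed becomes the one in the comment: the outermost SELECT is added by MySQL and is called fake_select_lex. Since fake_select_lex usually only adds a select that reads rows from the UNION temporary table, it involves no GROUP BY, WHERE or HAVING.

In the execute phase all three query blocks, including fake_select_lex, follow the same path: the !unit->is_simple() branch of Sql_cmd_dml::execute_inner. unit->optimize has separate code for the real child selects and for fake_select_lex, but the logic is effectively the same.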

// three steps to optimize a select in SELECT_LEX_UNIT (including fake_select_lex)
thd->lex->set_current_select(select);
if (set_limit(thd, select)) DBUG_RETURN(true); 
if (select->optimize(thd)) DBUG_RETURN(true);
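
To make the role of the temporary table and of the extra outer read more concrete, here is a small sketch under stated assumptions. ToyUnionTable and toy_fake_select are hypothetical names, not MySQL code; duplicate removal is done with a linear search rather than whatever the server uses; and std::nullopt stands in for SQL NULL. The data are the id and manager_id columns of the employees rows above.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

using Value = std::optional<int>;  // std::nullopt models SQL NULL
using Rows = std::vector<Value>;

// Models the UNION temporary table: append rows, dropping duplicates when
// `distinct` is set. For UNION purposes two NULLs count as duplicates.
struct ToyUnionTable {
  Rows rows;
  void append(const Rows &input, bool distinct) {
    for (const Value &v : input) {
      if (distinct && std::find(rows.begin(), rows.end(), v) != rows.end())
        continue;
      rows.push_back(v);
    }
  }
};

// Models the extra outer read, the role the text assigns to fake_select_lex:
// scan the temporary table and hand the rows to the client.
Rows toy_fake_select(const ToyUnionTable &tmp) { return tmp.rows; }

// SELECT id FROM employees UNION SELECT manager_id FROM employees
Rows demo_union() {
  const Rows ids = {333, 198, 692, 29, 4610, 72, 123};
  const Rows manager_ids = {std::nullopt, 333, 333, 198, 29, 29, 692};
  ToyUnionTable tmp;
  tmp.append(ids, /*distinct=*/true);          // first query block
  tmp.append(manager_ids, /*distinct=*/true);  // second query block
  return toy_fake_select(tmp);                 // final read of the tmp table
}
```

In this toy, demo_union() yields the seven ids plus a single NULL row, since every non-NULL manager_id already appears as an id. The row order here is simply insertion order; SQL gives no ordering guarantee without ORDER BY.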

3. subquery in WHERE clause

SELECT id FROM employees WHERE id IN (SELECT manager_id FROM employees);

image

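In the prepare phase the outermost select behaves as described above. Because the subselect sits inside WHERE, it is represented as an Item; the corresponding class is Item_subselect (described in detail below). A subselect in WHERE is therefore not prepared and executed by walking the unit hierarchy described above; instead, the parser hands the subselect to the Item, which manages it itself. The subselect's prepare and execute are thus invoked by Item_subselect and its subselect_engine.

Side note:

The materialization in the EXPLAIN output above occurs because manager_id has no index; compare with the EXPLAIN output below. IN can thus be optimized into a semi-join in some scenarios, since the optimization approach is the same as for a join.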

image


4. subquery in FROM clause

SELECT iid FROM (SELECT id + 10 AS iid, name FROM employees) t1;

image

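(By default an optimization merges the derived table into the outer query block, which eliminates the subquery.)

In the prepare phase the outermost select behaves as described above. The subselect here produces a derived_table; the subselects of all derived tables are prepared through the call chain shown in the figure below.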

image

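In the execute phase the outermost select behaves as described above. When execution reaches TABLE_LIST::materialize_derived, the corresponding execute method is called directly (as with prepare: unit->execute for a UNION, otherwise join->exec directly) to build the derived table.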

image


5. Recursive CTE

WITH RECURSIVE employee_paths (id, name, path) AS
(
  SELECT id, name, CAST(id AS CHAR(200))
    FROM employees
    WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, CONCAT(ep.path, ',', e.id)
    FROM employee_paths AS ep JOIN employees AS e
      ON ep.id = e.manager_id
)
SELECT * FROM employee_paths ORDER BY path;

image

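For an introduction to CTEs, see the Common Table Expression section below.

In the prepare phase the outermost select behaves as described above. A CTE is handled as a derived table, so, like a FROM subquery, it goes through TABLE_LIST::resolve_derived to call SELECT_LEX::prepare. The SQL above contains three queries that produce a derived_table: the two subqueries of the UNION, plus the UNION query itself.

In the execute phase, the materialized table (the CTE table) is built in advance in QEP_TAB::prepare_scan, which is step 2 in the EXPLAIN output; exec then recursively executes the UNION of the two step-3 entries.

The recursive execution logic is spread across Recursive_executor in sql_union.cc and the recursive checks in sub_select in sql_executor.cc. For the example SQL A UNION B, the recursion happens in B: A is executed first and its result is written to the temporary table, so by the time B executes, employee_paths already contains data. This repeats until no new rows are written to the temporary table (see the if (row_count == new_row_count) check in Recursive_executor::more_iterations()).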
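
The loop described above can be sketched as a fixed-point iteration over an in-memory table. This is only an illustration of the idea, not the server's Recursive_executor: Employee, PathRow, toy_employee_paths and sample_employees are hypothetical names, the join is a plain nested loop, and the variable names row_count / new_row_count are borrowed from the check quoted above purely for readability.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct Employee {
  int id;
  std::string name;
  std::optional<int> manager_id;  // std::nullopt models SQL NULL
};

struct PathRow {
  int id;
  std::string name;
  std::string path;
};

// Toy evaluation of the employee_paths CTE: run the non-recursive block once,
// then keep running the recursive block over the rows added by the previous
// pass until a pass adds nothing.
std::vector<PathRow> toy_employee_paths(const std::vector<Employee> &employees) {
  std::vector<PathRow> tmp;  // stands in for the CTE's temporary table

  // Non-recursive block: WHERE manager_id IS NULL.
  for (const Employee &e : employees)
    if (!e.manager_id) tmp.push_back({e.id, e.name, std::to_string(e.id)});

  std::size_t read_from = 0;  // first row the recursive block has not seen yet
  for (;;) {
    const std::size_t row_count = tmp.size();
    // Recursive block: join the not-yet-seen rows of tmp with employees.
    for (std::size_t i = read_from; i < row_count; ++i) {
      const PathRow parent = tmp[i];  // copy: tmp may reallocate below
      for (const Employee &e : employees) {
        if (e.manager_id && *e.manager_id == parent.id)
          tmp.push_back({e.id, e.name, parent.path + "," + std::to_string(e.id)});
      }
    }
    const std::size_t new_row_count = tmp.size();
    if (row_count == new_row_count) break;  // no new rows: result is complete
    read_from = row_count;
  }

  // Final query block: ORDER BY path (plain string comparison).
  std::sort(tmp.begin(), tmp.end(),
            [](const PathRow &a, const PathRow &b) { return a.path < b.path; });
  return tmp;
}

// The employees rows inserted earlier in this article.
std::vector<Employee> sample_employees() {
  return {{333, "Yasmina", std::nullopt}, {198, "John", 333},
          {692, "Tarek", 333},            {29, "Pedro", 198},
          {4610, "Sarah", 29},            {72, "Pierre", 29},
          {123, "Adil", 692}};
}
```

With the sample rows, toy_employee_paths(sample_employees()) returns seven rows, starting with path "333" for Yasmina and ending with "333,692,123" for Adil. Each pass only reads the rows produced by the previous pass, which is why the loop stops as soon as a pass adds nothing.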


Item_subselect

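Subquery execution is entered through Item_subselect::val_int.

Item_subselect has the following derived classes, listed by inheritance:

  • Item_singlerow_subselect. Arguably it should be called Item_singlevalue_subselect: a subselect whose return value is a single constant.

    • Item_maxmin_subselect. In practice it serves the rewrite of ALL/ANY.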

    •  /*
          If this is an ALL/ANY single-value subquery predicate, try to rewrite
          it with a MIN/MAX subquery.
        
          E.g. SELECT * FROM t1 WHERE b > ANY (SELECT a FROM t2) can be rewritten
          with SELECT * FROM t1 WHERE b > (SELECT MIN(a) FROM t2).
        
          A predicate may be transformed to use a MIN/MAX subquery if it:
          1. has a greater than/less than comparison operator, and
          2. is not correlated with the outer query, and
          3. UNKNOWN results are treated as FALSE, or can never be generated, and
        */
      
  • Item_exists_subselect. The subselect of an EXISTS subquery; it may be handled by one of the following strategies: conversion to semijoin, materialization, or EXISTS.

    • Item_in_subselect. The subselect of an IN subquery.
      • Item_allany_subselect. ALL/ANY/SOME subselect.
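
The ANY to MIN rewrite quoted in the comment above can be checked on plain integers. The sketch below assumes the conditions listed there hold in their simplest form: the subquery is uncorrelated, contains no NULLs, and an UNKNOWN result is treated as FALSE. greater_than_any and greater_than_min are hypothetical helper names, not MySQL functions.

```cpp
#include <algorithm>
#include <vector>

// b > ANY (subquery): true if b is greater than at least one subquery value.
bool greater_than_any(int b, const std::vector<int> &sub) {
  return std::any_of(sub.begin(), sub.end(), [b](int a) { return b > a; });
}

// Rewritten form: b > (SELECT MIN(a) ...). An empty subquery gives MIN = NULL,
// so the comparison is not TRUE; the toy returns false in that case.
bool greater_than_min(int b, const std::vector<int> &sub) {
  if (sub.empty()) return false;
  return b > *std::min_element(sub.begin(), sub.end());
}
```

For example, with sub = {3, 9} both functions return true for b = 5 and false for b = 2. The benefit of the rewritten form is that the subquery collapses to a single value that can be computed once.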

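Item_subselect holds a subselect_engine (engine for short), which represents the actual execution logic of the subselect. subselect_engine has the following derived classes:

  • subselect_single_select_engine. Here single means a single query block (which may still join several tables), as opposed to a UNION. Its exec directly runs the JOIN's exec.
  • subselect_union_engine. Its exec calls SELECT_LEX_UNIT's execute, which runs exec on the JOIN of each select of the UNION in order.
  • subselect_indexsubquery_engine. When the column inside the IN subselect is indexed, an index lookup can be used.
    • subselect_hash_sj_engine. Hash semi-join exec for IN predicate.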

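Relationship between Item_subselect and subselect_engine: each holds a pointer to the other as a member variable, but logically Item_subselect contains the subselect_engine and is responsible for its lifetime.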
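
The ownership relationship can be sketched as an owner with a polymorphic member and a non-owning back pointer. This is an illustration under assumptions: all Toy* types are hypothetical, std::unique_ptr is used here only to make the ownership explicit (the server manages these objects differently), and the strings returned by exec() merely name the call each engine is described as making in the list above.

```cpp
#include <memory>
#include <string>
#include <utility>

struct ToyItemSubselect;  // forward declaration for the back pointer

// Hypothetical engine interface; the real subselect_engine has more hooks.
struct ToyEngine {
  ToyItemSubselect *item = nullptr;  // non-owning back pointer to the owner
  virtual ~ToyEngine() = default;
  virtual std::string exec() = 0;
};

struct ToySingleSelectEngine : ToyEngine {
  std::string exec() override { return "JOIN::exec"; }
};

struct ToyUnionEngine : ToyEngine {
  std::string exec() override { return "SELECT_LEX_UNIT::execute"; }
};

// The item owns its engine and forwards evaluation to it.
struct ToyItemSubselect {
  std::unique_ptr<ToyEngine> engine;

  explicit ToyItemSubselect(std::unique_ptr<ToyEngine> e) : engine(std::move(e)) {
    engine->item = this;  // wire the back pointer once ownership is taken
  }
  // Copying or moving would leave the engine's back pointer stale.
  ToyItemSubselect(const ToyItemSubselect &) = delete;
  ToyItemSubselect &operator=(const ToyItemSubselect &) = delete;

  std::string exec() { return engine->exec(); }
};
```

Constructing ToyItemSubselect with std::make_unique<ToyUnionEngine>() and calling exec() dispatches to the union engine; destroying the item destroys the engine with it, which is the lifetime rule stated above.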

// Prepare phase:
Used inside Item_subselect::fix_fields() according to this scenario:
> Item_subselect::fix_fields {
  > engine->prepare {
    > query_block->prepare {
      (Here we realize we need to do the rewrite and set
       substitution= some new Item, eg. Item_in_optimizer )
    }
  }
  *ref= substitution;
}

// Exec phase:
> Item_subselect::val_int {
  > Item_subselect::exec() {
    // For Item_in_subselect, the value of left_expr is computed first. (left_expr IN (subselect))

    > SELECT_LEX_UNIT::optimize() { // iterate
      > SELECT_LEX::optimize() {
        JOIN::optimize();
      }
    }
    > engine->exec() {
      // subselect_single_select_engine
      JOIN::exec();

      // subselect_union_engine
      > SELECT_LEX_UNIT::execute() { // iterate over the query blocks
        JOIN::exec();
      }

      // subselect_indexsubquery_engine
      query table by index by ha_index_read_map();

      // subselect_hash_sj_engine
      materialize_if_not_materialized(); // internally a single_select_engine runs exec to do the materialization
      subselect_indexsubquery_engine::exec(); // index read on the materialized table
    }
  }
}

// Cleanup phase
> Item_subselect::cleanup {
  engine->cleanup();
}
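
The last branch of the exec phase (materialize once, then probe) can be sketched as follows. ToyHashInEngine is a hypothetical class, an std::unordered_set stands in for the indexed materialized table, and NULL handling is ignored entirely; the method name materialize_if_not_materialized is reused from the trace above only to show where the step sits.

```cpp
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

// Toy IN-subquery evaluator: runs the inner query once, stores its values in a
// hash set, and answers every later probe with a lookup.
class ToyHashInEngine {
 public:
  explicit ToyHashInEngine(std::function<std::vector<int>()> inner_query)
      : inner_query_(std::move(inner_query)) {}

  // Evaluates `left_expr IN (subquery)` for one outer row.
  bool exec(int left_expr) {
    materialize_if_not_materialized();
    return values_.count(left_expr) > 0;
  }

  // How many times the inner query has actually been run.
  int materialize_calls() const { return materialize_calls_; }

 private:
  void materialize_if_not_materialized() {
    if (materialized_) return;
    for (int v : inner_query_()) values_.insert(v);
    materialized_ = true;
    ++materialize_calls_;
  }

  std::function<std::vector<int>()> inner_query_;
  std::unordered_set<int> values_;
  bool materialized_ = false;
  int materialize_calls_ = 0;
};
```

Fed with the non-NULL manager_id values of the sample table, exec(333) is true and exec(123) is false, and the inner query runs only once no matter how many outer rows are probed.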

Common Table Expression (CTE)

Ref: WITH syntax https://dev.mysql.com/doc/refman/8.0/en/with.html

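A CTE differs from a subquery in that a CTE is typically materialized up front into a derived table, which can then be used several times or recursively. A subquery is only processed when it is encountered (by materialization, elimination, semi-join and so on), so the same subquery used in several places may be materialized several times. A CTE differs from a view or a temporary table in that a CTE exists only within a single query, whereas views and temporary tables can be used across queries.

The class that represents a CTE is Common_table_expr in table.h.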

/**
  After parsing, a Common Table Expression is accessed through a
  TABLE_LIST. This class contains all information about the CTE which the
  TABLE_LIST needs.

  @note that before and during parsing, the CTE is described by a
  PT_common_table_expr.
*/
class Common_table_expr {
 public:
  Common_table_expr(MEM_ROOT *mem_root)
      : references(mem_root), recursive(false), tmp_tables(mem_root) {}
  
  // Creates a new tmp table with the same structure as the CTE table
  TABLE *clone_tmp_table(THD *thd, TABLE_LIST *tl);
  
  // Clones tmp_tables[0] into sl->tl; sl->tl->table is later used as the reference to the CTE temporary table
  bool substitute_recursive_reference(THD *thd, SELECT_LEX *sl);
  
  /**
     All references to this CTE in the statement, except those inside the
     query expression defining this CTE.
     In other words, all non-recursive references.
  */
  // i.e. the temporary tables referenced by the CTE part of the current query (base tables are not included)
  Mem_root_array<TABLE_LIST *> references;
  
  /// True if it's a recursive CTE
  bool recursive;
  
  /**
    List of all TABLE_LISTSs reading/writing to the tmp table created to
    materialize this CTE. Due to shared materialization, only the first one
    has a TABLE generated by create_tmp_table(); other ones have a TABLE
    generated by open_table_from_share().
  */
  // References to the tmp table objects of this CTE; because of recursion the same table may have several shared references
  Mem_root_array<TABLE_LIST *> tmp_tables;
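};

Execution:

In TABLE_LIST::materialize_derived(THD *thd), the CTE is treated as a UNION select and executed by calling SELECT_LEX_UNIT::execute(), which in turn uses Recursive_executor.

Recursive_executor's initialize opens all tmp tables starting from first_recursive. first_recursive is the first select for which is_recursive holds (see TABLE_LIST::resolve_derived); the SELECT_LEX list is ordered with the non-recursive selects first, followed contiguously by the recursive ones. SELECT_LEX_UNIT::execute() then runs the corresponding SELECT_LEX->join->exec() in order.

For execution and the recursion termination condition, see the Recursive CTE section above.

Other CTE-related code: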

  1. SELECT_LEX_UNIT::prepare

    // If this is the first select (query block) that references the CTE,
    // create the temporary table that the following query blocks will read.
    if (sl == first_recursive) {
      // create_result_table() depends on current_select()
      save_select.restore();
      /*
        All next query blocks will read the temporary table, which we must
        thus create now:
      */
      if (derived_table->setup_materialized_derived_tmp_table(thd_arg))
        goto err; /* purecov: inspected */
      thd_arg->lex->set_current_select(sl);
    }

    // For a recursive CTE, replace the CTE reference in this SELECT_LEX with a cloned CTE tmp table.
    if (sl->recursive_reference)  // Make tmp table known to query block:
      derived_table->common_table_expr()->substitute_recursive_reference(
          thd_arg, sl);

  2. sql_tmp_table.cc:create_ondisk_from_heap()

    TABLE_LIST *const wtable_list = wtable->pos_in_table_list;
    Derived_refs_iterator ref_it(wtable_list);
    
    if (wtable_list) {
      Common_table_expr *cte = wtable_list->common_table_expr();
      if (cte) {
        // Find the position of wtable in the whole list of tables
        int i = 0, found = -1;
        TABLE *t;
        while ((t = ref_it.get_next())) {
          if (t == wtable) {
            found = i;
            break;
          }
          ++i;
        }
        DBUG_ASSERT(found >= 0);
        
        if (found > 0)
          // Why does wtable have to be moved to the front and handled first?
          // 'wtable' is at position 'found', move it to 0 to convert it first
          std::swap(cte->tmp_tables[0], cte->tmp_tables[found]);
        ref_it.rewind();
      }
    }
    

