High-End Tech: SQL Parsing


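Question: What does SQL parsing have to do with high-end tech?
Answer: The database is always the core of a system, and CRUD is deeply ingrained in every programmer's mind. If CRUD could be turned into high-end technology, wouldn't that benefit everyone?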

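CRUD stands for Create, Read, Update, Delete; in SQL these map to INSERT, SELECT, UPDATE, and DELETE.

In an ordinary scenario, an INSERT is just an INSERT; there is nothing sophisticated about it.
In a high-concurrency scenario, an INSERT is no longer one INSERT but thousands upon thousands of them. Techniques that apply include queuing, table sharding, partitioning, database sharding, and cache synchronization.

In an ordinary scenario, a SELECT is just a SELECT; there is nothing sophisticated about it.
In a high-concurrency scenario, a SELECT is no longer one SELECT but thousands upon thousands, and then thousands more. Techniques that apply include caching, basic read/write splitting, deeper read/write splitting, lock-free reads, past locks, and again table sharding, partitioning, and database sharding.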

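With this many concerns, is it better to have them all handled automatically behind a single SQL statement, or to have us programmers think through each one and hand-code the logic one by one?


Both camps surely have supporters, and there is surely a third opinion as well. So I will just follow my own line of thinking here; feel free to let your imagination run.

I want a single SQL statement to achieve all of the effects above, or something close to them.

How? Taking a SELECT statement as an example: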

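  1. Parse the SELECT statement
  2. Resolve the tables, columns, and primary keys involved
  3. Detect whether any of our own DSL extension functions are used
  4. Look up the partition function for each table
  5. Look up the cache configuration for each table
  6. Resolve each DSL function to its real implementation
  7. Other steps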
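
Steps 4 to 6 above are essentially lookups against per-table configuration. The sketch below shows one way such a registry might look. Every name in it (TableConfig, TABLES, DSL_FUNCTIONS, lookup) and the 60-second cache value are assumptions made for illustration; none of them come from the demo code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TableConfig:
    """Hypothetical per-table settings consulted after parsing."""
    primary_key: str
    shard_suffix: Callable[[str], str]   # step 4: partition function
    cache_ttl_seconds: Optional[int]     # step 5: cache config (None = no cache)


# Hypothetical registry keyed by the virtual table name (lower-cased).
TABLES: Dict[str, TableConfig] = {
    "users": TableConfig(
        primary_key="UserID",
        shard_suffix=lambda pk: pk[0].upper(),
        cache_ttl_seconds=60,
    ),
}

# Step 6: map a DSL function name to the real function that implements it.
DSL_FUNCTIONS: Dict[str, Callable[..., object]] = {
    "COUNT": lambda rows: len(rows),
}


def lookup(table_name: str, function_names: List[str]
           ) -> Tuple[TableConfig, Dict[str, Callable[..., object]]]:
    """Return the table's config and the real functions behind the DSL names."""
    config = TABLES[table_name.lower()]
    functions = {name: DSL_FUNCTIONS[name.upper()] for name in function_names}
    return config, functions
```

A parsed statement would then call lookup("Users", ["COUNT"]) once and carry the result through the rewriting stage.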

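For example, take these two SELECT statements:

  1. SELECT UserID, UserName, Age FROM Users WHERE UserID='some-guid'
  2. SELECT COUNT(1) FROM Users

These are two very simple SQL statements, but Users is a virtual table. Behind it are 16 physical tables: Users.[A-F] and Users.[0-9]. The sharding strategy picks the table by the first character of the primary key ID, so: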

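  • The first statement requires checking whether the UserID in the WHERE condition UserID='guid' is the primary key, and extracting the value 'guid'. The sharding function is then called with that value to get the table suffix, and the query that actually runs looks like this: SELECT UserID, UserName, Age FROM [Users.A] WHERE UserID='axxxxx-xxxxx-xxxx-xx'
  • The second statement ends up as 16 SQL statements, one per physical table, each returning that table's count; the program then sums those counts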
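
The two cases above can be sketched roughly as follows. This is a minimal illustration under the stated sharding rule (first character of the primary key, 16 tables); the function names are made up here, and the SQL is built by string formatting only to keep the example short, whereas real code would pass the key as a query parameter.

```python
from typing import Dict, List

# The 16 physical table suffixes: Users.[0-9] and Users.[A-F].
SHARD_SUFFIXES = "0123456789ABCDEF"


def shard_suffix(pk_value: str) -> str:
    """Sharding rule from the text: first character of the primary key."""
    first = pk_value[:1].upper()
    if not first or first not in SHARD_SUFFIXES:
        raise ValueError(f"cannot route primary key {pk_value!r}")
    return first


def route_point_query(columns: List[str], pk_column: str, pk_value: str) -> str:
    """Case 1: a primary-key lookup goes to exactly one physical table."""
    table = f"[Users.{shard_suffix(pk_value)}]"
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {pk_column}='{pk_value}'"


def fan_out_count() -> List[str]:
    """Case 2: COUNT without a key must be sent to every physical table."""
    return [f"SELECT COUNT(1) FROM [Users.{s}]" for s in SHARD_SUFFIXES]


def total_count(per_shard_counts: Dict[str, int]) -> int:
    """Sum the per-table counts in the application."""
    return sum(per_shard_counts.values())
```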

Other:

  • Other concerns such as caching, queues, and custom extension functions can be handled in a similar way.


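Since this is only a demo, not all of the features above are implemented; we will only cover the key principles and code. For instance, the demo output further down sends every query, including the one filtered on userId, to all 16 tables.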

We use ANTLR for lexical and syntactic analysis, and then a tree walker converts what ANTLR parsed into the data structures we need, such as SelectTerms, TableName, WhereClause, and OrderByClause.
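
As a rough picture of that target data structure, here is a sketch of a container holding the four pieces named above, plus a method that renders it back to SQL against a physical table. The class name ParsedSelect and the to_sql method are assumptions for illustration; the demo's real classes may be shaped differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsedSelect:
    """Hypothetical result of walking the parse tree of one SELECT."""
    select_terms: List[str]                 # SelectTerms
    table_name: str                         # TableName (the virtual table)
    where_clause: Optional[str] = None      # WhereClause, kept as raw text here
    order_by_clause: Optional[str] = None   # OrderByClause, kept as raw text here

    def to_sql(self, physical_table: str) -> str:
        """Render the statement against one physical table."""
        parts = [
            f"select {', '.join(self.select_terms)}",
            f"from [{physical_table}]",
        ]
        if self.where_clause:
            parts.append(f"WHERE {self.where_clause}")
        if self.order_by_clause:
            parts.append(f"order by {self.order_by_clause}")
        return " ".join(parts)
```

Keeping the virtual table name separate from the rendering step is what lets one parsed statement be emitted once per physical table.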

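Oh, and we also need to write a grammar file to feed to ANTLR; after that, ANTLR can drive the tree walker to produce the data structures we want.

(Everyone should brush up on fundamentals such as compiler theory, plus ANTLR itself.)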

grammar SelectSQL;

/*
 * Parser Rules
 */

compileUnit
    :    start
    ;

/*
 * Lexer Rules
 */

WS
    :    [ \t\n\r]+ -> skip
    ;



COMMA:',';
SELECT: 'SELECT';
STAR:'*';
FROM:'FROM';
WHERE:'WHERE';
ORDERBY:'ORDER BY';
DIRECTION:'ASC'|'DESC';
CHAR: 'a'..'z'|'A'..'Z';
NUM: '0'..'9';
STRING:'\'' .*? '\'';
LB:'(';
RB:')';
LBRACE:'[';
RBRACE:']';
CONDITIONS_OPERATOR    
    :'AND'
    |'OR'
    ;
CONDITION_OPERATOR    
    :'='
    |'>'
    |'<'
    |'<>'
    |'!='
    |'>='
    |'<='
    ;
FCOUNT:'COUNT';


/*
 * Parser Rules (continued)
 */

start
    :statement_list
    ;

statement_list
    :statement statement*
    ;

statement
    :selectStatement
    ;


selectStatement
    :selectStmt fromStmt whereStmt? orderbyStmt?
    ;

selectStmt
    :SELECT columns
    ;

columns
    :column (COMMA column)*
    ;

column
    : identifier
    | LBRACE identifier RBRACE
    | functionStmt
    | STAR
    ;

functionStmt
    :function LB (parameters) RB
    ;
    
function
    :FCOUNT
    ;

parameters
    : parameter (COMMA parameter)*
    ;

parameter
    : identifier
    | integer
    | string
    | STAR
    ;

fromStmt
    :FROM table
    ;

table
    : identifier
    | LBRACE identifier RBRACE
    ;


whereStmt
    : WHERE conditions
    ;
    
conditions
    : condition (CONDITIONS_OPERATOR condition)* 
    ;

condition
    :left CONDITION_OPERATOR right
    ;
    
left
    : parameter
    ;
    
right
    : parameter
    ;
    
orderbyStmt
    :ORDERBY sortStmt
    ;

sortStmt
    : sortCondition (COMMA sortCondition)*
    ;
    
sortCondition
    :sortColumn DIRECTION
    ;

sortColumn
    : identifier
    | LBRACE identifier RBRACE
    ;

identifier
    :CHAR (CHAR|NUM)*
    ;
integer
    :NUM+
    ;
string
    : STRING
    ;
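
To get a feel for what the lexer rules above produce before any parser rule runs, here is a hand-rolled regex tokenizer in Python. It is a stand-in, not ANTLR: it lumps keywords into one KEYWORD token type and matches whole identifiers as a single IDENT token (the grammar instead builds identifiers from CHAR and NUM tokens in a parser rule). The token names and the tokenize function are assumptions for this sketch.

```python
import re
from typing import List, Tuple

# Ordered token specification; earlier entries win, so keywords precede IDENT.
TOKEN_SPEC = [
    ("WS", r"[ \t\n\r]+"),
    ("STRING", r"'.*?'"),
    ("ORDERBY", r"ORDER BY\b"),
    ("KEYWORD", r"(?:SELECT|FROM|WHERE|AND|OR|ASC|DESC|COUNT)\b"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("NUM", r"[0-9]+"),
    ("OP", r"<>|!=|>=|<=|=|>|<"),
    ("PUNCT", r"[,*()\[\]]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(sql: str) -> List[Tuple[str, str]]:
    """Split SQL text into (token_type, text) pairs, skipping whitespace."""
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(sql):
        match = MASTER.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character {sql[pos]!r} at position {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Like the grammar, this sketch is case-sensitive: only upper-case keywords are recognized as keywords.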

 

 I sincerely urge developers everywhere to dig into foundational topics such as compiler theory!

Once you enter the SQL text to parse in Eclipse, it is parsed into a tree.

 The open-source world is truly powerful: ready-made parsing tools such as yacc, flex, bison, and ANTLR are all available.

 We first get the grammar working in Eclipse, and then copy the .g4 grammar file into our Visual Studio project.

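From then on, whenever the .g4 file is saved, the ANTLR Visual Studio plugin automatically regenerates the lexer class, the parser class, and the tree listener that we are about to override, all named after the grammar file.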

SelectSQLBaseListener: the base class generated by the ANTLR plugin, with an empty, overridable enter/exit method for each rule. All of our changes are overrides on this class.

EnterXXXXX/ExitXXXX: these correspond to rule names in the grammar file. Enter is called when the walker enters a rule, and Exit just before it leaves that rule.
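
The enter/exit mechanism can be imitated in a few lines without ANTLR. The sketch below uses a hand-built tree and a tiny walker that calls enter_<rule> and exit_<rule> methods if the listener defines them. Node, walk, and ShardingListener are names invented for this illustration; the generated SelectSQLBaseListener and ANTLR's own tree walker play these roles in the real demo.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A stand-in for a parse-tree node: a rule name, its text, its children."""
    rule: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def _call(listener: object, name: str, node: Node) -> None:
    handler = getattr(listener, name, None)
    if handler is not None:
        handler(node)


def walk(listener: object, node: Node) -> None:
    """Depth-first walk: enter the rule, visit children, then exit the rule."""
    _call(listener, f"enter_{node.rule}", node)
    for child in node.children:
        walk(listener, child)
    _call(listener, f"exit_{node.rule}", node)


class ShardingListener:
    """Overrides only the callbacks it cares about, like a BaseListener subclass."""

    def __init__(self) -> None:
        self.table_name: Optional[str] = None
        self.events: List[str] = []

    def enter_selectStatement(self, node: Node) -> None:
        self.events.append("enter selectStatement")

    def enter_table(self, node: Node) -> None:
        self.table_name = node.text          # remember the virtual table
        self.events.append("enter table")

    def exit_selectStatement(self, node: Node) -> None:
        self.events.append("exit selectStatement")


tree = Node("selectStatement", children=[
    Node("selectStmt", "SELECT *"),
    Node("fromStmt", children=[Node("table", "users")]),
])
listener = ShardingListener()
walk(listener, tree)
```

After the walk, listener.table_name holds the virtual table name, which is the piece the sharding rewrite needs.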

 

Console output of the demo program:

Input SQL:

                    SELECT * FROM users
                    SELECT userId, userName FROM users
                    SELECT COUNT(1) FROM users
                    SELECT COUNT(*) FROM users
                    SELECT userId, userName FROM users ORDER BY userName DESC
                    SELECT userId, userName FROM users WHERE userId='1212121' ORDER BY userName DESC

Output SQL:
          select * from [users.0]
          select * from [users.1]
          select * from [users.2]
          select * from [users.3]
          select * from [users.4]
          select * from [users.5]
          select * from [users.6]
          select * from [users.7]
          select * from [users.8]
          select * from [users.9]
          select * from [users.a]
          select * from [users.b]
          select * from [users.c]
          select * from [users.d]
          select * from [users.e]
          select * from [users.f]
          select userId, userName from [users.0]
          select userId, userName from [users.1]
          select userId, userName from [users.2]
          select userId, userName from [users.3]
          select userId, userName from [users.4]
          select userId, userName from [users.5]
          select userId, userName from [users.6]
          select userId, userName from [users.7]
          select userId, userName from [users.8]
          select userId, userName from [users.9]
          select userId, userName from [users.a]
          select userId, userName from [users.b]
          select userId, userName from [users.c]
          select userId, userName from [users.d]
          select userId, userName from [users.e]
          select userId, userName from [users.f]
          select COUNT(1) from [users.0]
          select COUNT(1) from [users.1]
          select COUNT(1) from [users.2]
          select COUNT(1) from [users.3]
          select COUNT(1) from [users.4]
          select COUNT(1) from [users.5]
          select COUNT(1) from [users.6]
          select COUNT(1) from [users.7]
          select COUNT(1) from [users.8]
          select COUNT(1) from [users.9]
          select COUNT(1) from [users.a]
          select COUNT(1) from [users.b]
          select COUNT(1) from [users.c]
          select COUNT(1) from [users.d]
          select COUNT(1) from [users.e]
          select COUNT(1) from [users.f]
          select COUNT(*) from [users.0]
          select COUNT(*) from [users.1]
          select COUNT(*) from [users.2]
          select COUNT(*) from [users.3]
          select COUNT(*) from [users.4]
          select COUNT(*) from [users.5]
          select COUNT(*) from [users.6]
          select COUNT(*) from [users.7]
          select COUNT(*) from [users.8]
          select COUNT(*) from [users.9]
          select COUNT(*) from [users.a]
          select COUNT(*) from [users.b]
          select COUNT(*) from [users.c]
          select COUNT(*) from [users.d]
          select COUNT(*) from [users.e]
          select COUNT(*) from [users.f]
          select userId, userName from [users.0]  order by userName DESC
          select userId, userName from [users.1]  order by userName DESC
          select userId, userName from [users.2]  order by userName DESC
          select userId, userName from [users.3]  order by userName DESC
          select userId, userName from [users.4]  order by userName DESC
          select userId, userName from [users.5]  order by userName DESC
          select userId, userName from [users.6]  order by userName DESC
          select userId, userName from [users.7]  order by userName DESC
          select userId, userName from [users.8]  order by userName DESC
          select userId, userName from [users.9]  order by userName DESC
          select userId, userName from [users.a]  order by userName DESC
          select userId, userName from [users.b]  order by userName DESC
          select userId, userName from [users.c]  order by userName DESC
          select userId, userName from [users.d]  order by userName DESC
          select userId, userName from [users.e]  order by userName DESC
          select userId, userName from [users.f]  order by userName DESC
          select userId, userName from [users.0] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.1] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.2] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.3] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.4] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.5] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.6] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.7] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.8] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.9] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.a] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.b] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.c] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.d] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.e] WHERE userId='1212121' order by userName DESC
          select userId, userName from [users.f] WHERE userId='1212121' order by userName DESC
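
The ORDER BY queries above each return rows sorted within one physical table, so the caller still has to combine them into one ordered result, much like summing the per-table counts. The demo does not show this step; the sketch below is one possible approach using heapq.merge from the Python standard library, with a made-up function name and sample rows.

```python
import heapq
from typing import Dict, Iterable, List


def merge_sorted_desc(per_shard_rows: Iterable[List[Dict[str, str]]],
                      column: str) -> List[Dict[str, str]]:
    """Merge row lists that are each already sorted descending on `column`."""
    return list(heapq.merge(*per_shard_rows,
                            key=lambda row: row[column],
                            reverse=True))


# Hypothetical rows as two physical tables might return them for
# "... order by userName DESC".
shard_rows = [
    [{"userId": "0a", "userName": "zoe"}, {"userId": "0b", "userName": "bob"}],
    [{"userId": "a1", "userName": "mia"}, {"userId": "a2", "userName": "al"}],
]
merged = merge_sorted_desc(shard_rows, "userName")
```

Because each input is already sorted, the merge only compares the current head of each list instead of re-sorting everything.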

 

I hope everyone develops a real interest in foundational technology; go study compiler theory and ANTLR.

 

Apologies for not explaining the underlying principles in detail; a quick search on Baidu will turn up everything you need.

 

Code download: http://files.cnblogs.com/files/aarond/SQLParser_Select.rar

 

