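Question: What does SQL parsing have to do with building something high-end?
Answer: The database is always the core of a system, and CRUD is deeply ingrained in every programmer's mind. If CRUD could be turned into high-end technology, wouldn't that benefit everyone?
CRUD stands for Create, Read, Update, Delete, which map to the SQL statements INSERT, SELECT, UPDATE, and DELETE.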
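In an ordinary scenario, an insert is just an insert; there is nothing deep about it.
Under high concurrency, an insert is no longer one insert but thousands upon thousands of inserts. Techniques that apply here include queuing, table sharding, partitioning, database sharding, and cache synchronization.
In an ordinary scenario, a select is just a select; there is nothing deep about it either.
Under high concurrency, a select is no longer one select but thousands upon thousands of selects, and then thousands more. Techniques that apply here include caching, basic read/write splitting, deeper read/write splitting, reads without locks, "past" locks (presumably SQL Server's READPAST hint), as well as table sharding, partitioning, and database sharding.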
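With so many things to handle, is it better to have all of them automated behind a single SQL statement, or to have us programmers think through each one and hand-code the logic one piece at a time?
There will surely be voices on both sides, and surely a third one as well. So I will just follow my own line of thinking; feel free to use your imagination.
I want a single SQL statement to deliver all of the effects above, or something close to them.
How? Taking the SELECT statement as an example: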
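- Parse the SELECT statement
- Work out the tables, columns, and primary keys involved
- Detect whether any of our own extended DSL functions are used
- Find the partition function for each table
- Find the cache configuration for each table
- Find the real function behind each DSL function
- And so on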
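The lookup steps in the list above can be sketched as a small registry. This is only an illustration under assumptions: `TableConfig`, `ResolvedQuery`, `TABLES`, `DSL_FUNCTIONS`, and `resolve` are hypothetical names that do not come from the demo, and the parsing step itself is assumed to have already produced a table name and a list of function names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class TableConfig:
    primary_key: str                     # column used for routing
    shard_of: Callable[[str], str]       # partition function: key value -> shard suffix
    cache_ttl_seconds: Optional[int]     # None would mean "do not cache"


@dataclass
class ResolvedQuery:
    table: str
    config: TableConfig
    functions: Dict[str, Callable]       # DSL name as written -> real implementation


# Hypothetical registries; a real system would load these from configuration.
TABLES: Dict[str, TableConfig] = {
    "users": TableConfig("UserID", lambda key: key[0].upper(), 60),
}
DSL_FUNCTIONS: Dict[str, Callable] = {
    "COUNT": sum,  # assumption: per-shard counts are merged by summing
}


def resolve(table: str, function_names: List[str]) -> ResolvedQuery:
    """Look up everything a rewriter would need for one parsed SELECT."""
    config = TABLES[table.lower()]       # KeyError: table is not registered
    functions = {name: DSL_FUNCTIONS[name.upper()] for name in function_names}
    return ResolvedQuery(table, config, functions)


resolved = resolve("Users", ["COUNT"])
print(resolved.config.primary_key, resolved.config.shard_of("a1b2"))
```

Keeping the partition function and cache settings next to the table name means the rewriting code never has to hard-code a table; it only asks the registry.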
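For example, take two SELECT statements:
- SELECT UserID, UserName, Age FROM Users WHERE UserID='some guid'
- SELECT COUNT(1) FROM Users
These are two very simple SQL statements, but Users is a virtual table. Behind it there are 16 physical tables, Users.[A-F] and Users.[0-9], and the sharding strategy picks the table from the first character of the primary key ID. Therefore: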
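- The first statement requires parsing the WHERE condition UserID='guid' to determine whether UserID is the primary key and to extract the 'guid' value. The sharding function is then called with that value to obtain the shard suffix, and the real query looks something like this: SELECT UserID, UserName, Age FROM [Users.A] WHERE UserID='axxxxx-xxxxx-xxxx-xx'
- The second statement ultimately turns into 16 SQL statements, one per shard, and the program adds up the per-shard count values.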
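The two cases above, routing a primary-key lookup to one shard and fanning a COUNT out to all 16, can be sketched as follows. This is a minimal illustration, not the demo's code: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the real database (SQLite happens to accept the same `[Users.A]` bracket quoting), and `shard_table`, `select_user`, `count_users`, and the sample rows are all made up for the example.

```python
import sqlite3

SHARDS = "0123456789ABCDEF"  # 16 physical tables: Users.0 .. Users.F


def shard_table(user_id: str) -> str:
    """Pick the physical table from the first character of the key."""
    suffix = user_id[0].upper()
    if suffix not in SHARDS:
        raise ValueError(f"no shard for key {user_id!r}")
    return f"[Users.{suffix}]"


def select_user(conn, user_id):
    # Primary-key lookup: exactly one shard is queried.
    sql = (f"SELECT UserID, UserName, Age FROM {shard_table(user_id)} "
           "WHERE UserID = ?")
    return conn.execute(sql, (user_id,)).fetchone()


def count_users(conn):
    # COUNT fans out to all 16 shards; the partial counts are summed in code.
    return sum(
        conn.execute(f"SELECT COUNT(1) FROM [Users.{s}]").fetchone()[0]
        for s in SHARDS
    )


# Set up the 16 shard tables and a few made-up rows.
conn = sqlite3.connect(":memory:")
for s in SHARDS:
    conn.execute(
        f"CREATE TABLE [Users.{s}] "
        "(UserID TEXT PRIMARY KEY, UserName TEXT, Age INTEGER)"
    )
for uid, name, age in [("a1f0", "alice", 30), ("07c2", "bob", 41), ("ae91", "carol", 25)]:
    conn.execute(f"INSERT INTO {shard_table(uid)} VALUES (?, ?, ?)", (uid, name, age))

print(select_user(conn, "a1f0"))
print(count_users(conn))
```

The shard suffix is validated against a fixed set before it is spliced into the SQL text, while the key value itself goes through a bound parameter.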
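Other cases:
- Caching, queues, and custom extension functions can all be handled in a similar way.
Since this is only a demo, not all of the features above are implemented. We will only cover the key principles and the code.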
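We use ANTLR for lexical and syntactic analysis, then use a tree walker to convert what ANTLR parsed into the data structures we need, such as SelectTerms, TableName, WhereClause, and OrderByClause.
Oh, and we also have to write a grammar file to feed to ANTLR. ANTLR generates the lexer and parser from it, and the tree walker can then drive our listener to build those data structures.
(Everyone, go brush up on compiler fundamentals and on ANTLR.)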
grammar SelectSQL;

/*
 * Parser Rules
 */
compileUnit : start ;

/*
 * Lexer Rules
 */
WS : [ \t\n\r]+ -> skip ;
COMMA : ',' ;
SELECT : 'SELECT' ;
STAR : '*' ;
FROM : 'FROM' ;
WHERE : 'WHERE' ;
ORDERBY : 'ORDER BY' ;
DIRECTION : 'ASC' | 'DESC' ;
CHAR : 'a'..'z' | 'A'..'Z' ;
NUM : '0'..'9' ;
STRING : '\'' .*? '\'' ;
LB : '(' ;
RB : ')' ;
LBRACE : '[' ;
RBRACE : ']' ;
CONDITIONS_OPERATOR : 'AND' | 'OR' ;
CONDITION_OPERATOR : '=' | '>' | '<' | '<>' | '!=' | '>=' | '<=' ;
FCOUNT : 'COUNT' ;

start : statement_list ;
statement_list : statement statement* ;
statement : selectStatement ;
selectStatement : selectStmt fromStmt whereStmt? orderbyStmt? ;
selectStmt : SELECT columns ;
columns : column (COMMA column)* ;
column : identifier | LBRACE identifier RBRACE | functionStmt | STAR ;
functionStmt : function LB (parameters) RB ;
function : FCOUNT ;
parameters : parameter (COMMA parameter)* ;
parameter : identifier | integer | string | STAR ;
fromStmt : FROM table ;
table : identifier | LBRACE identifier RBRACE ;
whereStmt : WHERE conditions ;
conditions : condition (CONDITIONS_OPERATOR condition)* ;
condition : left CONDITION_OPERATOR right ;
left : parameter ;
right : parameter ;
orderbyStmt : ORDERBY sortStmt ;
sortStmt : sortCondition (COMMA sortCondition)* ;
sortCondition : sortColumn DIRECTION ;
sortColumn : identifier | LBRACE identifier RBRACE ;
identifier : CHAR (CHAR|NUM)* ;
integer : NUM+ ;
string : STRING ;
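I sincerely call on developers everywhere to dig into foundational technologies such as compiler theory!
In Eclipse, once you enter the SQL text to parse, it is parsed into a tree.
The open-source world is really powerful: there are ready-made parsing tools such as yacc, flex, bison, and antlr.
We first get the grammar passing its tests in Eclipse, then copy the .g4 grammar file into our Visual Studio project, as follows:
After that, every time the .g4 file is saved, the ANTLR Visual Studio plugin automatically generates, from the grammar file, the correspondingly named lexer class, parser class, and the tree listener that we are about to override.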
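SelectSQLBaseListener: the base listener class generated automatically by the ANTLR plugin. All of our changes are overrides on this class (for the enter/exit of each rule).
EnterXXXXX/ExitXXXX: these correspond to the rule names in the grammar file. Enter and Exit are the actions taken when entering a rule and when leaving it.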
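To make the enter/exit idea concrete, here is a hand-rolled sketch of the listener pattern. It does not use the ANTLR runtime or the generated SelectSQLBaseListener; `Node`, `walk`, `QueryBuilder`, and the `enter_...`/`exit_...` method names are hypothetical stand-ins, and the tree is built by hand rather than by a parser. It only shows how per-rule callbacks can fill in a structure such as SelectTerms and TableName.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rule: str                                    # grammar rule name, e.g. "fromStmt"
    text: str = ""                               # matched text for leaf-like rules
    children: List["Node"] = field(default_factory=list)


@dataclass
class SelectQuery:
    select_terms: List[str] = field(default_factory=list)
    table_name: str = ""


def walk(node: Node, listener) -> None:
    """Depth-first walk: call enter_<rule>, visit children, call exit_<rule>."""
    enter = getattr(listener, f"enter_{node.rule}", None)
    if enter:
        enter(node)
    for child in node.children:
        walk(child, listener)
    leave = getattr(listener, f"exit_{node.rule}", None)
    if leave:
        leave(node)


class QueryBuilder:
    """Listener that only defines hooks for the rules it cares about."""

    def __init__(self):
        self.query = SelectQuery()
        self.events: List[str] = []

    def enter_selectStatement(self, node):
        self.events.append("enter selectStatement")

    def enter_column(self, node):
        self.query.select_terms.append(node.text)

    def enter_table(self, node):
        self.query.table_name = node.text

    def exit_selectStatement(self, node):
        self.events.append("exit selectStatement")


# Hand-built tree for: SELECT userId, userName FROM users
tree = Node("selectStatement", children=[
    Node("selectStmt", children=[Node("column", "userId"), Node("column", "userName")]),
    Node("fromStmt", children=[Node("table", "users")]),
])
builder = QueryBuilder()
walk(tree, builder)
print(builder.query)
```

With ANTLR itself, the walking is done by the runtime's tree walker and the hooks are the generated Enter/Exit methods, so only the overrides need to be written by hand.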
Output of the demo console program:
Input SQL:
SELECT * FROM users
SELECT userId, userName FROM users
SELECT COUNT(1) FROM users
SELECT COUNT(*) FROM users
SELECT userId, userName FROM users ORDER BY userName DESC
SELECT userId, userName FROM users WHERE userId='1212121' ORDER BY userName DESC

Output SQL:
select * from [users.0]
select * from [users.1]
select * from [users.2]
select * from [users.3]
select * from [users.4]
select * from [users.5]
select * from [users.6]
select * from [users.7]
select * from [users.8]
select * from [users.9]
select * from [users.a]
select * from [users.b]
select * from [users.c]
select * from [users.d]
select * from [users.e]
select * from [users.f]
select userId, userName from [users.0]
select userId, userName from [users.1]
select userId, userName from [users.2]
select userId, userName from [users.3]
select userId, userName from [users.4]
select userId, userName from [users.5]
select userId, userName from [users.6]
select userId, userName from [users.7]
select userId, userName from [users.8]
select userId, userName from [users.9]
select userId, userName from [users.a]
select userId, userName from [users.b]
select userId, userName from [users.c]
select userId, userName from [users.d]
select userId, userName from [users.e]
select userId, userName from [users.f]
select COUNT(1) from [users.0]
select COUNT(1) from [users.1]
select COUNT(1) from [users.2]
select COUNT(1) from [users.3]
select COUNT(1) from [users.4]
select COUNT(1) from [users.5]
select COUNT(1) from [users.6]
select COUNT(1) from [users.7]
select COUNT(1) from [users.8]
select COUNT(1) from [users.9]
select COUNT(1) from [users.a]
select COUNT(1) from [users.b]
select COUNT(1) from [users.c]
select COUNT(1) from [users.d]
select COUNT(1) from [users.e]
select COUNT(1) from [users.f]
select COUNT(*) from [users.0]
select COUNT(*) from [users.1]
select COUNT(*) from [users.2]
select COUNT(*) from [users.3]
select COUNT(*) from [users.4]
select COUNT(*) from [users.5]
select COUNT(*) from [users.6]
select COUNT(*) from [users.7]
select COUNT(*) from [users.8]
select COUNT(*) from [users.9]
select COUNT(*) from [users.a]
select COUNT(*) from [users.b]
select COUNT(*) from [users.c]
select COUNT(*) from [users.d]
select COUNT(*) from [users.e]
select COUNT(*) from [users.f]
select userId, userName from [users.0] order by userName DESC
select userId, userName from [users.1] order by userName DESC
select userId, userName from [users.2] order by userName DESC
select userId, userName from [users.3] order by userName DESC
select userId, userName from [users.4] order by userName DESC
select userId, userName from [users.5] order by userName DESC
select userId, userName from [users.6] order by userName DESC
select userId, userName from [users.7] order by userName DESC
select userId, userName from [users.8] order by userName DESC
select userId, userName from [users.9] order by userName DESC
select userId, userName from [users.a] order by userName DESC
select userId, userName from [users.b] order by userName DESC
select userId, userName from [users.c] order by userName DESC
select userId, userName from [users.d] order by userName DESC
select userId, userName from [users.e] order by userName DESC
select userId, userName from [users.f] order by userName DESC
select userId, userName from [users.0] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.1] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.2] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.3] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.4] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.5] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.6] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.7] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.8] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.9] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.a] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.b] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.c] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.d] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.e] WHERE userId='1212121' order by userName DESC
select userId, userName from [users.f] WHERE userId='1212121' order by userName DESC
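I hope everyone takes a real interest in foundational technologies; go learn compiler theory and ANTLR.
Sorry for not providing a detailed explanation of the principles; a quick Baidu search will turn up everything.
Code download: http://files.cnblogs.com/files/aarond/SQLParser_Select.rar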