A Brief Introduction to Phoenix (SQL on HBase)
Introduction:
Phoenix is a SQL skin over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
Deployment:
1: wget http://phoenix-bin.github.com/client/phoenix-2.2.1-install.tar, then copy the jar into HBASE_HOME/lib and you are done.
2: Run psql.sh localhost ../examples/web_stat.sql ../examples/web_stat.csv ../examples/web_stat_queries.sql to load the sample data.
3: Run sqlline.sh localhost (the ZooKeeper address) to enter the command-line client.
Related documentation:
Wiki home page (the docs are very detailed):
https://github.com/forcedotcom/phoenix/wiki
Quick Start
https://github.com/forcedotcom/phoenix/wiki/Phoenix-in-15-minutes-or-less
Recently Implemented Features
https://github.com/forcedotcom/phoenix/wiki/Recently-Implemented-Features
Phoenix Performance vs Hive, Impala
https://github.com/forcedotcom/phoenix/wiki/Performance#salting
Official live performance test results:
http://phoenix-bin.github.io/client/performance/latest.htm
Syntax:
http://forcedotcom.github.io/phoenix/index.html
Secondary indexes (using the indexes requires going through the Phoenix API):
Using secondary indexes (on mutable, multi-column tables) requires adding the following configuration to hbase-site.xml:
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
Example of creating an index:
create table usertable (id varchar primary key, firstname varchar, lastname varchar);
create index idx_name on usertable (firstname) include (lastname);
You can check whether an index is used for a given SQL statement as follows:
explain select id, firstname, lastname from usertable where firstname = 'foo';
Notes on the explain output:
RANGE SCAN means that only a subset of the rows in your table will be scanned. This occurs when you filter on one or more leading columns of your primary key constraint. A query that does not filter on leading PK columns, e.g. select * from test where pk2='x' and pk3='y';, will result in a full scan, whereas select * from test where pk1='x' and pk2='y'; will result in a range scan. Note that you can add a secondary index on the "pk2" and "pk3" columns, which would cause a range scan (over the index table) to be done for the first query.
DEGENERATE SCAN means that the query cannot possibly return any rows. If we can determine that at compile time, we don't even bother to run the scan.
FULL SCAN means that all rows of the table will be scanned (potentially with a filter applied if you have a WHERE clause).
SKIP SCAN means that either a subset or all rows in your table will be scanned, but large groups of rows will be skipped depending on the conditions in your filter. Phoenix does not do a SKIP SCAN if you have no filter on the leading primary key columns, but you can force one with the /*+ SKIP_SCAN */ hint. Under some conditions, namely when the cardinality of your leading primary key columns is low, it is more efficient than a FULL SCAN.
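The plan types above can be boiled down to a toy decision function. This is purely illustrative, a sketch of the rules just stated: the class name ScanTypeSketch and its inputs are made up, and Phoenix's real planner considers far more (hints, secondary indexes, statistics).

```java
import java.util.List;
import java.util.Set;

public class ScanTypeSketch {
    // Guess the plan type from the PK column order and the set of columns
    // the WHERE clause filters on. "contradiction" stands for a filter that
    // is provably unsatisfiable at compile time.
    public static String classify(List<String> pkColumns, Set<String> filtered,
                                  boolean contradiction) {
        if (contradiction) return "DEGENERATE SCAN"; // provably zero rows
        if (filtered.isEmpty()) return "FULL SCAN";  // no filter at all
        if (filtered.contains(pkColumns.get(0))) return "RANGE SCAN"; // leading PK filtered
        // filters exist but skip the leading PK column: full scan, unless a
        // SKIP_SCAN hint or an applicable secondary index changes the plan
        return "FULL SCAN";
    }

    public static void main(String[] args) {
        List<String> pk = List.of("pk1", "pk2", "pk3");
        System.out.println(classify(pk, Set.of("pk1", "pk2"), false)); // RANGE SCAN
        System.out.println(classify(pk, Set.of("pk2", "pk3"), false)); // FULL SCAN
    }
}
```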
How indexes are used:
- Primary key index: leading PK columns must be referenced in the order they were declared. With primary key(id,name,add), three indexes are created implicitly: (id), (id,name), and (id,name,add); a WHERE clause using one of these three combinations will use the index, while any other combination cannot (FULL SCAN).
- Secondary index: besides referencing columns in declaration order, the index cannot be used if the queried columns are not all contained in the index (or its covered columns).
Example:
DDL: create table usertable (id varchar primary key, firstname varchar, lastname varchar);
create index idx_name on usertable (firstname);
DML: select id, firstname, lastname from usertable where firstname = 'foo';
This query will not use the index, because lastname is not in the index.
Only after running the DDL create index idx_name on usertable (firstname) include (lastname) can the query use the index.
Open question: what exactly is the difference between include and on in Phoenix? (Presumably: columns listed in on become part of the index row key, while include columns are merely stored in the index as covered columns.)
- When both the primary key index and a secondary index apply to a query, Phoenix chooses the optimal index by itself.
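The leading-prefix rule for the primary key described above can be sketched in a few lines. The helper name usesPkIndex is invented, and real Phoenix planning is more nuanced (e.g. a filter on (id, add) would still get a range scan on id with add applied as a post-filter); this only models the rule as stated:

```java
import java.util.List;
import java.util.Set;

public class PkPrefixSketch {
    // The rule stated above: the PK "index" applies only when the filtered
    // columns form a leading prefix of the declared PRIMARY KEY order.
    public static boolean usesPkIndex(List<String> pkOrder, Set<String> whereCols) {
        int matched = 0;
        for (String col : pkOrder) {
            if (whereCols.contains(col)) matched++;
            else break; // first gap ends the prefix
        }
        // every filtered column must be part of that prefix
        return matched > 0 && matched == whereCols.size();
    }

    public static void main(String[] args) {
        List<String> pk = List.of("id", "name", "add");
        System.out.println(usesPkIndex(pk, Set.of("id", "name")));  // true: leading prefix
        System.out.println(usesPkIndex(pk, Set.of("name", "add"))); // false: skips "id"
    }
}
```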
Experiment: how Phoenix SQL table structures map to HBase structures
>>create table user3table (id varchar, firstname varchar, lastname varchar CONSTRAINT PK PRIMARY KEY (id,firstname));
>>!describe user3table
+------------+-------------+------------+-------------+-----------+------------+-------------+---------------+----------------+----------------+------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULL |
+------------+-------------+------------+-------------+-----------+------------+-------------+---------------+----------------+----------------+------+
| null | null | USER3TABLE | ID | 12 | VARCHAR | null | null | null | null | 1 |
| null | null | USER3TABLE | FIRSTNAME | 12 | VARCHAR | null | null | null | null | 1 |
| _0 | null | USER3TABLE | LASTNAME | 12 | VARCHAR | null | null | null | null | 1 |
+------------+-------------+------------+-------------+-----------+------------+-------------+---------------+----------------+----------------+------+
>>!index user3table;
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+------------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | NON_UNIQUE | INDEX_QUALIFIER | INDEX_NAME | TYPE | ORDINAL_POSITION | COLUMN_NAME | ASC_OR_DESC | CARDINALIT |
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+------------+
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+------------+
>>select * from user3table;
+------------+------------+------------+
| ID | FIRSTNAME | LASTNAME |
+------------+------------+------------+
| hup | zhan | feng |
+------------+------------+------------+
>>hbase>>scan 'USER3TABLE'
ROW COLUMN+CELL
hup\x00zhan column=_0:LASTNAME, timestamp=1387875092585, value=feng
hup\x00zhan column=_0:_0, timestamp=1387875092585, value=
1 row(s) in 0.0110 seconds
>>create index idx_test on user3table (firstname) include (lastname);
>>!index user3table;
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+------------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | NON_UNIQUE | INDEX_QUALIFIER | INDEX_NAME | TYPE | ORDINAL_POSITION | COLUMN_NAME | ASC_OR_DESC | CARDINALIT |
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+------------+
| null | null | USER3TABLE | true | null | IDX_TEST | 3 | 1 | :FIRSTNAME | A | null |
| null | null | USER3TABLE | true | null | IDX_TEST | 3 | 2 | :ID | A | null |
| null | null | USER3TABLE | true | null | IDX_TEST | 3 | 3 | _0:LASTNAME | null | null |
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+------------+
>>select * from user3table;
+------------+------------+------------+
| ID | FIRSTNAME | LASTNAME |
+------------+------------+------------+
| hup | zhan | feng |
+------------+------------+------------+
>>hbase>>scan 'USER3TABLE'
ROW COLUMN+CELL
hup\x00zhan column=_0:LASTNAME, timestamp=1387875092585, value=feng
hup\x00zhan column=_0:_0, timestamp=1387875092585, value=
1 row(s) in 0.0110 seconds
In addition: when the table has more than one non-primary-key column, an empty marker column (the _0 suffix) is still added uniformly:
1:NASalesforce.com\x00Login\x00\x00\x00 column=STATS:ACTIVE_VISITOR, timestamp=1387867968156, value=\x80\x00\x1A"
\x01C%\x17\xFE0
2:NASalesforce.com\x00Login\x00\x00\x00 column=USAGE:CORE, timestamp=1387867968156, value=\x80\x00\x00\x00\x00\x00\x00\xC9
\x01C%\x17\xFE0
3:NASalesforce.com\x00Login\x00\x00\x00 column=USAGE:DB, timestamp=1387867968156, value=\x80\x00\x00\x00\x00\x00\x02\x84
\x01C%\x17\xFE0
4:NASalesforce.com\x00Login\x00\x00\x00 column=USAGE:_0, timestamp=1387867968156, value=
\x01C%\x17\xFE0
Conclusions:
1: Phoenix concatenates the columns declared via "CONSTRAINT PK PRIMARY KEY (id,firstname)" into the HBase row key (separated by \x00), and for the composite key adds a column with the default name "_0" whose value is empty; all other columns are stored in HBase columns of the same name.
2: Phoenix maintains a system table (SYSTEM TABLE) in HBase to store the schema metadata of Phoenix tables.
3: Creating a secondary index (create index) does not change the table structure.
4: If no column family is specified at table creation, column families are named _0, _1, and so on.
5: When there are multiple columns, the values fetched through the raw HBase API are not directly usable (normal values can only be obtained through the Phoenix API).
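The row-key concatenation in conclusion 1 can be reproduced in a few lines. This is a simplification that assumes all PK columns are variable-length strings (Phoenix only uses the \x00 separator for variable-length types; fixed-width types are encoded without it), and compositeKey is an invented helper:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class RowKeySketch {
    // Join the PK column values with a \x00 byte, as seen in the HBase
    // shell output above (row "hup\x00zhan").
    public static byte[] compositeKey(String... pkValues) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < pkValues.length; i++) {
            if (i > 0) out.write(0x00); // separator between variable-length parts
            byte[] b = pkValues[i].getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] key = compositeKey("hup", "zhan");
        StringBuilder sb = new StringBuilder();
        for (byte b : key) sb.append(b == 0 ? "\\x00" : String.valueOf((char) (b & 0xff)));
        System.out.println(sb); // hup\x00zhan
    }
}
```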
Dynamic schema:
1: Altering columns is supported
Example:
ALTER TABLE my_schema.my_table ADD d.dept_id char(10) VERSIONS=10
ALTER TABLE my_table ADD dept_name char(50)
ALTER TABLE my_table ADD parent_id char(15) null primary key
ALTER TABLE my_table DROP COLUMN d.dept_id
ALTER TABLE my_table DROP COLUMN dept_name
ALTER TABLE my_table DROP COLUMN parent_id
ALTER TABLE my_table SET IMMUTABLE_ROWS=true
2: Altering secondary indexes is supported
Example:
CREATE INDEX my_idx ON sales.opportunity(last_updated_date DESC)
CREATE INDEX my_idx ON log.event(created_date DESC) INCLUDE (name, payload) SALT_BUCKETS=10
CREATE INDEX IF NOT EXISTS my_comp_idx ON server_metrics ( gc_time DESC, created_date DESC )
DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)
ALTER INDEX my_idx ON sales.opportunity DISABLE
ALTER INDEX IF EXISTS my_idx ON server_metrics REBUILD
DROP INDEX my_idx ON sales.opportunity
DROP INDEX IF EXISTS my_idx ON server_metrics
3: Altering the primary key index is presumably not supported (I found no related information; in theory it would be hard to support, since the primary key is the row key itself).
Java client sample code (programming directly against the JDBC interface):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class test {
    public static void main(String[] args) throws SQLException {
        // "zookeeper" stands for the ZooKeeper quorum address of the HBase cluster
        Connection con = DriverManager.getConnection("jdbc:phoenix:zookeeper");
        Statement stmt = con.createStatement();
        stmt.executeUpdate("create table test (mykey integer not null primary key, mycolumn varchar)");
        stmt.executeUpdate("upsert into test values (1,'Hello')");
        stmt.executeUpdate("upsert into test values (2,'World!')");
        con.commit(); // upserts are only sent to the server on commit
        stmt.close();

        PreparedStatement statement = con.prepareStatement("select * from test");
        ResultSet rset = statement.executeQuery();
        while (rset.next()) {
            System.out.println(rset.getString("mycolumn"));
        }
        rset.close();
        statement.close();
        con.close();
    }
}
Single-node test:
Create table:
CREATE TABLE IF NOT EXISTS $table (HOST CHAR(2) NOT NULL,DOMAIN VARCHAR NOT NULL,
FEATURE VARCHAR NOT NULL,DATE DATE NOT NULL,USAGE.CORE BIGINT,USAGE.DB BIGINT,STATS.ACTIVE_VISITOR
INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN, FEATURE, DATE))
SPLIT ON ('CSGoogle','CSSalesforce','EUApple','EUGoogle','EUSalesforce','NAApple','NAGoogle','NASalesforce');
The performance_10000000 table contains 10 million rows like the following:
+------+------------+------------+---------------------+----------+----------+----------------+
| HOST | DOMAIN | FEATURE | DATE | CORE | DB | ACTIVE_VISITOR |
+------+------------+------------+---------------------+----------+----------+----------------+
| CS | Apple.com | Dashboard | 2013-12-23 | 363 | 795 | 8390 |
+------+------------+------------+---------------------+----------+----------+----------------+
Query # 1 - Count - SELECT COUNT(1) FROM performance_10000000;
COUNT(1)
--------
10000000
Time: 66.044 sec(s)
Query # 2 - Group By First PK - SELECT HOST FROM performance_10000000 GROUP BY HOST;
HOST
----
CS
EU
NA
Time: 51.43 sec(s)
Query # 3 - Group By Second PK - SELECT DOMAIN FROM performance_10000000 GROUP BY DOMAIN;
DOMAIN
----------
Apple.com
Google.com
Salesforce.com
Time: 46.908 sec(s)
Query # 4 - Truncate + Group By - SELECT TRUNC(DATE,'DAY') DAY FROM performance_10000000 GROUP BY TRUNC(DATE,'DAY');
DAY
-------------------
2013-12-23 00:00:00
2013-12-24 00:00:00
2013-12-25 00:00:00
......
Time: 48.132 sec(s)
Query # 5 - Filter + Count - SELECT COUNT(1) FROM performance_10000000 WHERE CORE<10;
COUNT(1)
--------
198669
Time: 31.301 sec(s)
Cluster test (5 RegionServers):
The performance_10000000 table contains 10 million rows like the following:
+------+------------+------------+---------------------+----------+----------+----------------+
| HOST | DOMAIN | FEATURE | DATE | CORE | DB | ACTIVE_VISITOR |
+------+------------+------------+---------------------+----------+----------+----------------+
| CS | Apple.com | Dashboard | 2013-12-23 | 363 | 795 | 8390 |
+------+------------+------------+---------------------+----------+----------+----------------+
Query # 1 - Count - SELECT COUNT(1) FROM performance_10000000;
COUNT(1)
--------
19630614
Time: 13.879 sec(s)
Query # 2 - Group By First PK - SELECT HOST FROM performance_10000000 GROUP BY HOST;
HOST
----
CS
EU
NA
Time: 13.545 sec(s)
Query # 3 - Group By Second PK - SELECT DOMAIN FROM performance_10000000 GROUP BY DOMAIN;
DOMAIN
----------
Apple.com
Google.com
Salesforce.com
Time: 12.907 sec(s)
Query # 4 - Truncate + Group By - SELECT TRUNC(DATE,'DAY') DAY FROM performance_10000000 GROUP BY TRUNC(DATE,'DAY');
DAY
-------------------
2013-12-23 00:00:00
2013-12-24 00:00:00
2013-12-25 00:00:00
......
Time: 13.845 sec(s)
Query # 5 - Filter + Count - SELECT COUNT(1) FROM performance_10000000 WHERE CORE<10;
COUNT(1)
--------
393154
Time: 8.522 sec(s)
Pros:
1: Both the command line and the Java client are very easy to use; in particular, the Java client programs directly against the JDBC interface, which wraps and optimizes many HBase details.
2: Single-table performance is much better than the Hive HBase handler's (although the handler may also gain coprocessor-based aggregation features in future upgrades).
3: Supports secondary indexes over multiple columns, with no limit on the column count. For mutable indexes, write speed drops as the number of indexed columns grows; immutable indexes do not affect write speed (see: https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing#mutable-indexing).
4: Top-N queries are far faster than in Hive (see: https://github.com/forcedotcom/phoenix/wiki/Performance#top-n).
5: Provides row key salting (bucketing), which distributes data evenly across regions (see: https://github.com/forcedotcom/phoenix/wiki/Performance#salting).
6: Low intrusiveness; essentially no impact on existing HBase usage.
7: The built-in functions cover the vast majority of needs.
8: Unlike Hive's, Phoenix's SQL is closer to the standard SQL specification.
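The salting in point 5 works by prepending a one-byte hash of the row key modulo SALT_BUCKETS, so monotonically increasing keys fan out over the buckets. A minimal sketch — salted is an invented helper, and the hash here is a stand-in for Phoenix's own hash function:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SaltingSketch {
    // Prepend salt = hash(rowKey) % buckets, spreading rows over "buckets"
    // contiguous key ranges (one per region after pre-splitting).
    public static byte[] salted(byte[] rowKey, int buckets) {
        byte salt = (byte) Math.abs(Arrays.hashCode(rowKey) % buckets);
        byte[] out = new byte[rowKey.length + 1];
        out[0] = salt;
        System.arraycopy(rowKey, 0, out, 1, rowKey.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] k = salted("row-0001".getBytes(StandardCharsets.UTF_8), 4);
        System.out.println("bucket " + k[0]); // a value in [0, 4)
    }
}
```

Note the trade-off: a scan over a salted table has to run one range scan per bucket and merge the results, so salting trades read fan-out for even write distribution.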
Cons:
1: HBase can recognize and use tables created by Phoenix, but Phoenix cannot recognize tables created directly in HBase, because Phoenix keeps its own metadata for each of its tables.
2: Deal-breaker: multi-table joins are currently not supported (the official docs are inconsistent about the current 2.2.3 release, but 3.0 should support them, possibly borrowing Apache Drill's technique of splitting a big-table join into small tasks).
3: Currently only the HBase 0.94 series is supported (probably not a big problem).
Other:
1: Phoenix does all of its aggregation in memory, so from the complexity of the SQL statement and the performance characteristics of the Java data structures involved, you can roughly estimate how much time a query will consume.
Possible extensions (I'm not that familiar with Impala, so this mainly concerns Hive):
Integrating Hive and Phoenix would be quite attractive; the two complement each other well. Hive does not use coprocessors and simply pulls the data out to run MapReduce, whereas Phoenix has heavily optimized single-table access. After integration you would enjoy Phoenix's single-table advantages while also solving the multi-table join problem (Phoenix is unlikely to deliver a large-table join solution in the short term; they say they will imitate Drill, but Drill itself is still in Alpha).
The main integration work would be adapting Hive's single-table operations in hive-hbase-handler to Phoenix's Java client interface.
Open questions (not yet well thought out):
1: Port Phoenix's single-table operations into Hive, or port Hive's joins into Phoenix?
2: Expose only the Hive interface, or expose both the Hive and Phoenix interfaces?
3: There are still many details in the adaptation process.
I also briefly investigated whether Phoenix's secondary indexes can, like Huawei's, take effect without modifying any HBase code once the index is created.
Further reading:
SQL for HBase (Phoenix, Impala, Drill): http://www.orzota.com/sql-for-hbase/
The latest progress of SQL on Hadoop and 7 related technologies: http://www.csdn.net/article/2013-10-18/2817214-big-data-hadoop
Comparison with Huawei's HBase secondary index:
Cons: Huawei's secondary index requires the indexed columns to be specified at table creation (i.e. no dynamic modification), and its code is highly intrusive to HBase itself (for example, the balancer must be Huawei's), making upgrades and maintenance difficult.
Pros: once an index is built, HBase scan, Puts, and Deletes operations get the benefit of the index through the native HBase code path, with no modification needed. You don't have to specify which index to use either; it picks the optimal index by itself.
In other words, with Huawei's index, Hive-hbase-handler can use secondary indexes without any modification, whereas Phoenix currently only supports secondary indexes through Phoenix SQL.
Performance comparison: not yet tested; probably about the same.
Overall, porting Phoenix looks more sensible than porting Huawei's index: Phoenix is less intrusive, more powerful, and more maintainable going forward. However, porting Phoenix is also comparatively harder.
But if you only want quick results in the short term, Huawei's index is worth a try.
Taobao's open-source project Lealone:
A distributed SQL engine usable with HBase; its main function is querying HBase with SQL (JDBC), avoiding HBase's cumbersome native API. Its functionality is much weaker than Phoenix's.
- Supports high-performance distributed transactions
- Uses a rather novel distributed transaction model based on local timestamps with multi-version conflict and validity detection
- Is an improvement and extension of the H2 relational database's SQL engine
- Tables created in HBase are read-only to Lealone; tables created by Lealone can be both read and written by Lealone.
Multi-condition HBase queries based on Solr:
Introduction: Apache Solr is an open-source search server, developed in Java and based mainly on HTTP and Apache Lucene.
How it works: the principle is very simple. The row key and the fields involved in filter conditions are indexed in Solr; a multi-condition Solr query quickly yields the row keys matching the filters, and those row keys are then used for direct lookups in HBase.
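In miniature, the two-step lookup just described looks like this. This is an in-memory stand-in: the index map plays Solr's role, the store map plays HBase's, and the class and method names are invented:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SolrLookupSketch {
    private final Map<String, Set<String>> index = new HashMap<>(); // term -> row keys (the "Solr" side)
    private final Map<String, String> store = new HashMap<>();      // row key -> row   (the "HBase" side)

    public void put(String rowKey, String field, String value, String row) {
        index.computeIfAbsent(field + "=" + value, k -> new HashSet<>()).add(rowKey);
        store.put(rowKey, row);
    }

    // Multi-condition query: intersect the row-key sets of all terms in the
    // index, then point-get each surviving row key from the main store.
    public Set<String> query(String... terms) {
        Set<String> keys = null;
        for (String t : terms) {
            Set<String> hit = index.getOrDefault(t, Set.of());
            if (keys == null) keys = new HashSet<>(hit);
            else keys.retainAll(hit);
        }
        if (keys == null) return Set.of();
        Set<String> rows = new HashSet<>();
        for (String k : keys) rows.add(store.get(k));
        return rows;
    }

    public static void main(String[] args) {
        SolrLookupSketch s = new SolrLookupSketch();
        s.put("r1", "host", "CS", "row1");
        s.put("r2", "host", "EU", "row2");
        System.out.println(s.query("host=CS")); // [row1]
    }
}
```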
Cons:
1: Apache Solr itself was not designed for HBase. You have to write the HBase-side integration yourself; for example, the coprocessor that synchronously updates the index when HBase writes data.
2: Apache Solr is a web service, so one or more additional Solr servers must be maintained.
References:
1: Multi-condition HBase queries based on Solr: http://www.cnblogs.com/chenz/articles/3229997.html
2: http://blog.csdn.net/chenjia3615349/article/details/8112289#t40
Lealone Chinese wiki: https://github.com/codefollower/Lealone
Personally, I think Phoenix is the most suitable and most promising of these SQL-for-HBase projects.