Simple HBase operations from Python via Thrift


 

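Thrift is a binary communication middleware (an RPC framework) that Facebook developed and open-sourced. With Thrift we can take advantage of the strengths of each language and write efficient code.

A paper on Thrift: http://pan.baidu.com/share/link?shareid=234128&uk=3238841275

Installing Thrift: http://thrift.apache.org/docs/install/ubuntu/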

After installation, go to the HBase directory and find Hbase.thrift. The file is located under

hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift

Running thrift --gen py Hbase.thrift generates a gen-py folder; rename it to hbase.

Install the Python thrift library:

sudo pip install thrift

Start the HBase Thrift service: bin/hbase-daemon.sh start thrift. The default port is 9090.
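Before running any of the scripts, it can help to confirm that something is actually listening on the Thrift port. The following is a minimal sketch using only the standard library; the helper name thrift_port_open is made up for this example, and it only checks that a TCP connection succeeds, not that the listener really speaks Thrift.

```python
import socket


def thrift_port_open(host="localhost", port=9090, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it only proves a listener exists, not that the
    listener is the HBase Thrift server.
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (connection refused, timeout, ...) when nothing is listening.
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True


if __name__ == "__main__":
    print("Thrift port reachable: %s" % thrift_port_open())
```

If this prints False, the Thrift service has probably not started, or it is bound to a different host or port.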

Create an HBase table:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import ColumnDescriptor

# Connect to the HBase Thrift server (default port 9090)
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)
transport.open()

# One column family 'cf' that keeps a single version per cell
contents = ColumnDescriptor(name='cf:', maxVersions=1)
client.createTable('test', [contents])

# List all tables to confirm that 'test' now exists
print(client.getTableNames())

transport.close()

Run the code. Once it succeeds, open the HBase shell and run the list command; the newly created test table should appear.
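Running the creation script a second time fails, because the table already exists. One way around that is to check getTableNames() first. The sketch below shows the idea; ensure_table and FakeClient are hypothetical names for this example, and FakeClient is only an in-memory stand-in so the snippet runs without an HBase server. With the real client under Python 3, table names may come back as bytes, in which case the comparison would need a bytes name as well.

```python
def ensure_table(client, table_name, column_descriptors):
    """Create table_name unless getTableNames() already lists it.

    Returns True if the table was created, False if it already existed.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_descriptors)
    return True


class FakeClient(object):
    """In-memory stand-in (hypothetical) for Hbase.Client."""

    def __init__(self):
        self.tables = {}

    def getTableNames(self):
        return list(self.tables)

    def createTable(self, name, descriptors):
        self.tables[name] = descriptors


client = FakeClient()
print(ensure_table(client, "test", ["cf:"]))  # first call creates the table
print(ensure_table(client, "test", ["cf:"]))  # second call leaves it alone
```

With a real connection, the same function would be called with the Hbase.Client instance and a list of ColumnDescriptor objects, as in the script above.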

Insert data:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import Mutation

# Connect to the HBase Thrift server
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)
transport.open()

row = 'row-key1'

# Write the value '1' to column 'cf:a' of that row
mutations = [Mutation(column="cf:a", value="1")]
client.mutateRow('test', row, mutations, None)

transport.close()

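Once the insert succeeds, the result can be checked from the HBase shell with the scan command (for example, scan 'test').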
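When a row has several columns, writing each Mutation by hand gets repetitive. The following sketch builds the list from a plain dict. The helper name build_mutations is made up for this example, and FakeMutation is a stand-in for hbase.ttypes.Mutation so the snippet runs without the generated code; in a real script the generated Mutation class would be passed as mutation_cls.

```python
from collections import namedtuple

# Stand-in (hypothetical) for hbase.ttypes.Mutation, which accepts the same
# column= and value= keyword arguments.
FakeMutation = namedtuple("FakeMutation", ["column", "value"])


def build_mutations(family, values, mutation_cls=FakeMutation):
    """Turn {'a': '1'} into [mutation_cls(column='cf:a', value='1')]."""
    return [
        mutation_cls(column="%s:%s" % (family, qualifier), value=value)
        # sorted() only makes the order of the resulting list predictable
        for qualifier, value in sorted(values.items())
    ]


mutations = build_mutations("cf", {"a": "1", "b": "2"})
for m in mutations:
    print("%s = %s" % (m.column, m.value))
```

The resulting list can then be passed to client.mutateRow in place of the hand-written one.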

Get a single row:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase

# Connect to the HBase Thrift server
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)
transport.open()

table_name = 'test'
row_key = 'row-key1'

# getRow returns a list of row results
result = client.getRow(table_name, row_key, None)
print(result)
for r in result:
    print('the row is %s' % r.row)
    # r.columns maps a column name to a cell that carries the value
    print('the value is %s' % r.columns.get('cf:a').value)

transport.close()

 

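getRow returns a list of TRowResult objects. Each one carries the row key in row and a map from column name to cell in columns.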
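For further processing, it is often easier to flatten these results into plain dictionaries. The sketch below assumes result objects with a row attribute and a columns map whose cells have value and timestamp attributes; FakeRow and FakeCell are stand-ins with that shape, and rows_to_dicts is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-ins (hypothetical) shaped like the generated row result and cell
FakeCell = namedtuple("FakeCell", ["value", "timestamp"])
FakeRow = namedtuple("FakeRow", ["row", "columns"])


def rows_to_dicts(results):
    """Map each row key to a {column: value} dict, dropping timestamps."""
    return {
        r.row: {col: cell.value for col, cell in r.columns.items()}
        for r in results
    }


sample = [FakeRow("row-key1", {"cf:a": FakeCell("1", 1360000000000)})]
print(rows_to_dicts(sample))
print(rows_to_dicts([]))  # an empty result list gives an empty dict
```

Unlike calling columns.get('cf:a').value directly, this does not raise an error when a row lacks a particular column; the column is simply absent from the inner dict.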

To return multiple rows, use a scan:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import TScan

# Connect to the HBase Thrift server
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)
transport.open()

# An empty TScan scans the whole table
scan = TScan()
table_name = 'test'
scanner_id = client.scannerOpenWithScan(table_name, scan, None)

# Fetch up to 10 rows in a single call
rows = client.scannerGetList(scanner_id, 10)
print(rows)

# Release the server-side scanner, then the connection
client.scannerClose(scanner_id)
transport.close()

scannerGetList fetches up to 10 rows here, and the script then prints the result.
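A table usually holds more rows than one batch, so scannerGetList has to be called repeatedly until it returns an empty list. The generator below is one way to sketch that loop. The name scan_rows is made up for this example, and FakeScannerClient is an in-memory stand-in that only mimics the scannerGetList and scannerClose calls so the snippet runs without HBase.

```python
def scan_rows(client, scanner_id, batch_size=10):
    """Yield rows batch by batch until the scanner is exhausted."""
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:  # an empty list means there are no more rows
                break
            for row in batch:
                yield row
    finally:
        # Release the scanner even if the caller stops iterating early
        client.scannerClose(scanner_id)


class FakeScannerClient(object):
    """In-memory stand-in (hypothetical) for Hbase.Client."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.closed = False

    def scannerGetList(self, scanner_id, nb_rows):
        batch, self.rows = self.rows[:nb_rows], self.rows[nb_rows:]
        return batch

    def scannerClose(self, scanner_id):
        self.closed = True


fake = FakeScannerClient(["row-%d" % i for i in range(25)])
rows = list(scan_rows(fake, scanner_id=1, batch_size=10))
print("%d rows, scanner closed: %s" % (len(rows), fake.closed))
```

With a real connection, scanner_id would be the value returned by scannerOpenWithScan. A larger batch_size means fewer round trips at the cost of bigger responses.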

scannerGet, by contrast, fetches only one row per call:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import TScan

# Connect to the HBase Thrift server
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)
transport.open()

scan = TScan()
table_name = 'test'
scanner_id = client.scannerOpenWithScan(table_name, scan, None)

# scannerGet returns an empty list once the scanner is exhausted
result = client.scannerGet(scanner_id)
while result:
    print(result)
    result = client.scannerGet(scanner_id)

# Release the server-side scanner, then the connection
client.scannerClose(scanner_id)
transport.close()


