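Thrift is a binary communication middleware developed and open-sourced by Facebook. Because it lets services written in different languages call one another, we can play to the strengths of each language and write efficient code.
Paper on Thrift: http://pan.baidu.com/share/link?shareid=234128&uk=3238841275
Installing Thrift: http://thrift.apache.org/docs/install/ubuntu/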
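Once Thrift is installed, go to the HBase directory and locate Hbase.thrift. The file can be found under
hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift
Running thrift --gen py Hbase.thrift generates a gen-py folder; rename it to hbase. (The file name is case-sensitive, and the scripts below import the generated code as the hbase package.)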
Install the Thrift library for Python:
sudo pip install thrift
Start the HBase Thrift service with bin/hbase-daemon.sh start thrift. The default port is 9090.
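Before running any of the scripts, it can help to confirm that something is actually listening on the Thrift port. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical, and localhost is assumed because the scripts in this post connect there.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    conn.close()
    return True


# Example call (9090 is the default port mentioned above):
#   is_port_open("localhost", 9090)
```

A True result only shows that the port accepts TCP connections; it does not prove that the listener is the HBase Thrift server.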
Create an HBase table:

    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import ColumnDescriptor

    # Connect to the HBase Thrift server on its default port.
    sock = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(sock)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Create table 'test' with one column family 'cf' that keeps one version.
    contents = ColumnDescriptor(name='cf:', maxVersions=1)
    client.createTable('test', [contents])

    print(client.getTableNames())

    transport.close()
Run the code. Once it succeeds, open the HBase shell and run the list command; the newly created test table should appear.
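Running the creation script a second time is expected to fail, because the table already exists. One way around that is to check getTableNames() first. The sketch below assumes only the two client methods used above (getTableNames and createTable); the helper create_table_if_missing and the FakeClient class are hypothetical, and FakeClient exists purely so the example runs without an HBase server.

```python
def create_table_if_missing(client, table_name, column_descriptors):
    """Create table_name unless getTableNames() already lists it.

    Returns True if the table was created, False if it already existed.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_descriptors)
    return True


class FakeClient(object):
    """Hypothetical in-memory stand-in for Hbase.Client (illustration only)."""

    def __init__(self):
        self.tables = {}

    def getTableNames(self):
        return list(self.tables)

    def createTable(self, name, column_descriptors):
        self.tables[name] = list(column_descriptors)


client = FakeClient()
print(create_table_if_missing(client, 'test', ['cf:']))  # first call creates
print(create_table_if_missing(client, 'test', ['cf:']))  # second call skips
```

With a real connection, the same helper would be called with the Hbase.Client instance and a list of ColumnDescriptor objects. The check and the creation are two separate calls, so this is not safe against two clients creating the same table at once.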
Insert data:

    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import Mutation

    sock = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(sock)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Write the value "1" to column 'cf:a' of row 'row-key1'.
    row = 'row-key1'
    mutations = [Mutation(column="cf:a", value="1")]
    client.mutateRow('test', row, mutations, None)

    transport.close()
After the insert succeeds, run scan 'test' in the HBase shell to check the inserted row.
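Every script in this post opens the transport, and it is easy to forget to close it again when an error interrupts the script. A small context manager can guarantee the close. This is a sketch under the assumption that the transport exposes open() and close(), as the Thrift transports used here do; the names opened and FakeTransport are hypothetical, and FakeTransport only stands in for a real transport so the example runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport, yield it, and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class FakeTransport(object):
    """Hypothetical stand-in that records open()/close() calls."""

    def __init__(self):
        self.is_open = False
        self.events = []

    def open(self):
        self.is_open = True
        self.events.append("open")

    def close(self):
        self.is_open = False
        self.events.append("close")


t = FakeTransport()
with opened(t):
    pass  # client calls such as mutateRow would go here
print(t.events)
```

With a real connection, the body of the with block would hold the client.mutateRow or client.getRow calls, and the transport is closed even if one of them raises.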
Get a single row:

    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase

    sock = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(sock)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    tableName = 'test'
    rowKey = 'row-key1'

    # getRow returns a list; it is empty if the row does not exist.
    result = client.getRow(tableName, rowKey, None)
    print(result)
    for r in result:
        print('the row is %s' % r.row)
        print('the value is %s' % r.columns.get('cf:a').value)

    transport.close()
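getRow returns a list of TRowResult objects. Each one carries the row key in row and a map from column name to cell in columns, which is what the loop above prints.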
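For further processing it is often handier to turn the returned list into plain dictionaries. The sketch below relies only on the fields used in the script above (row, columns, and value on each cell); the namedtuples Cell and RowResult are hypothetical stand-ins for the generated Thrift types, and rows_to_dict is a hypothetical helper name.

```python
from collections import namedtuple

# Hypothetical stand-ins that mimic only the fields accessed above.
Cell = namedtuple("Cell", ["value"])
RowResult = namedtuple("RowResult", ["row", "columns"])


def rows_to_dict(results):
    """Map each row key to a {column_name: value} dictionary."""
    return {
        r.row: {name: cell.value for name, cell in r.columns.items()}
        for r in results
    }


sample = [RowResult(row="row-key1", columns={"cf:a": Cell(value="1")})]
print(rows_to_dict(sample))  # {'row-key1': {'cf:a': '1'}}
```

Iterating over columns.items() also avoids the AttributeError that r.columns.get('cf:a').value would raise for a row that has no cf:a column.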
To return multiple rows, use a scan:

    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import TScan

    sock = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(sock)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Open a scanner over the whole table and fetch up to 10 rows.
    scan = TScan()
    tableName = 'test'
    scanner_id = client.scannerOpenWithScan(tableName, scan, None)
    result2 = client.scannerGetList(scanner_id, 10)
    print(result2)

    # Release the server-side scanner before disconnecting.
    client.scannerClose(scanner_id)
    transport.close()
scannerGetList fetches up to 10 rows here, and the script then prints them.
scannerGet, by contrast, fetches only one row per call:

    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import TScan

    sock = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(sock)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    scan = TScan()
    tableName = 'test'
    scanner_id = client.scannerOpenWithScan(tableName, scan, None)

    # scannerGet returns an empty list once the scanner is exhausted.
    result = client.scannerGet(scanner_id)
    while result:
        print(result)
        result = client.scannerGet(scanner_id)

    client.scannerClose(scanner_id)
    transport.close()
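Each pass through the loop prints the list returned by scannerGet, one row at a time, until an empty list ends the loop.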
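The two scanner calls can be combined: fetch rows in batches with scannerGetList, but hand them to the caller one at a time. The generator below is a sketch that assumes the client methods scannerOpenWithScan, scannerGetList and scannerClose; scan_rows and FakeScanClient are hypothetical names, and FakeScanClient is an in-memory stand-in so the example runs without a server.

```python
def scan_rows(client, table_name, scan, batch_size=10):
    """Yield every row of a scan, fetching batch_size rows per round trip."""
    scanner_id = client.scannerOpenWithScan(table_name, scan, None)
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break  # an empty batch means the scanner is exhausted
            for row in batch:
                yield row
    finally:
        # Always release the server-side scanner.
        client.scannerClose(scanner_id)


class FakeScanClient(object):
    """Hypothetical in-memory stand-in for Hbase.Client (illustration only)."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.pos = 0
        self.closed = []

    def scannerOpenWithScan(self, table_name, scan, attributes):
        self.pos = 0
        return 1

    def scannerGetList(self, scanner_id, nb_rows):
        batch = self.rows[self.pos:self.pos + nb_rows]
        self.pos += len(batch)
        return batch

    def scannerClose(self, scanner_id):
        self.closed.append(scanner_id)


fake = FakeScanClient(["row-%d" % i for i in range(25)])
for row in scan_rows(fake, 'test', None, batch_size=10):
    print(row)
```

Batching keeps the number of round trips low, while the generator keeps the calling code as simple as the scannerGet loop.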