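HBase concepts:
Commonly used databases such as Oracle and MySQL store data by row, whereas HBase is a column-oriented database whose storage scales horizontally by design.
HBase has two main concepts: the Row key and the Column Family.
A Column Family groups related columns; within each Column Family, a "qualifier" distinguishes any number of columns.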
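For example, take a User table. In a traditional database the columns of a table are fixed (name, age, sex and so on), and attributes of User cannot be added on the fly without changing the schema.
With a column-oriented store such as HBase, we can define the User table with an info column family, and a user's data becomes info:name = zhangsan, info:age = 30, info:sex = male, and so on.
If you later want another attribute, simply write to info:newProperty.
Likewise, stored user information is not always complete, and in a row store null cells waste space. In HBase, a column cell that has no value takes up no space.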
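To make the sparse, schema-free column idea concrete, here is a minimal sketch in plain Java. It is only a toy model and not HBase code: the class SparseRowSketch and its methods are hypothetical names invented for illustration. One row is modelled as a sorted map from "family:qualifier" to a value.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one HBase-style row (hypothetical class, not an HBase API). */
final class SparseRowSketch {
    // Key is "family:qualifier", e.g. "info:name". A cell that was never
    // written has no entry at all, so it occupies no space.
    private final Map<String, String> cells = new TreeMap<>();

    void put(String family, String qualifier, String value) {
        cells.put(family + ":" + qualifier, value);
    }

    /** Returns the value, or null when the cell was never written. */
    String get(String family, String qualifier) {
        return cells.get(family + ":" + qualifier);
    }

    int storedCells() {
        return cells.size();
    }

    public static void main(String[] args) {
        SparseRowSketch zhangsan = new SparseRowSketch();
        zhangsan.put("info", "name", "zhangsan");
        zhangsan.put("info", "age", "30");
        // A new attribute needs no schema change, only a new qualifier.
        zhangsan.put("info", "newProperty", "anything");
        // info:sex was never written, so nothing is stored for it.
        System.out.println(zhangsan.get("info", "sex")); // prints null
        System.out.println(zhangsan.storedCells());      // prints 3
    }
}
```

The point of the sketch is that a missing attribute is an absent map entry rather than a stored null.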
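Another very important benefit of HBase's approach is automatic splitting: once the data in a table exceeds a certain threshold, HBase splits it for us, which makes queries scalable. Combined with HBase's weak transactional guarantees (atomicity is per row only), writes to HBase also become very fast.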
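The automatic splitting described above can be pictured with a small plain-Java sketch. This is a deliberate simplification under stated assumptions: real HBase decides when to split from the size of a region's store files, whereas this toy counts rows, and the names RegionSplitSketch and MAX_ROWS_PER_REGION are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of threshold-based splitting (hypothetical, not HBase code). */
final class RegionSplitSketch {
    static final int MAX_ROWS_PER_REGION = 4; // assumed threshold for the demo

    // Each "region" holds a contiguous, sorted range of row keys.
    final List<TreeMap<String, String>> regions = new ArrayList<>();

    RegionSplitSketch() {
        regions.add(new TreeMap<>());
    }

    void put(String rowKey, String value) {
        int i = regionIndexFor(rowKey);
        TreeMap<String, String> region = regions.get(i);
        region.put(rowKey, value);
        if (region.size() > MAX_ROWS_PER_REGION) {
            split(i);
        }
    }

    // Picks the last region whose first key is <= rowKey (region 0 otherwise).
    private int regionIndexFor(String rowKey) {
        int index = 0;
        for (int i = 1; i < regions.size(); i++) {
            if (regions.get(i).firstKey().compareTo(rowKey) <= 0) {
                index = i;
            }
        }
        return index;
    }

    // Moves the upper half of an oversized region into a new region.
    private void split(int i) {
        TreeMap<String, String> region = regions.get(i);
        List<String> keys = new ArrayList<>(region.keySet());
        String midKey = keys.get(keys.size() / 2);
        TreeMap<String, String> upper = new TreeMap<>(region.tailMap(midKey, true));
        region.tailMap(midKey, true).clear();
        regions.add(i + 1, upper);
    }

    public static void main(String[] args) {
        RegionSplitSketch table = new RegionSplitSketch();
        for (int n = 1; n <= 6; n++) {
            table.put("user0" + n, "row " + n);
        }
        for (TreeMap<String, String> region : table.regions) {
            System.out.println(region.keySet());
        }
    }
}
```

Because every region covers a disjoint key range, a lookup by row key only has to touch one region, which is why splitting lets reads and writes spread across servers.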
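The other main concept is the Row key. You can think of the row key as the primary key of a row, but because HBase does not support SQL-style conditional queries or ORDER BY, the row key has to be designed around the queries your system needs.
For example, to query a person's information, the Row key can consist of three parts, <userId><timestamp><feedId>. To fetch someone's most recent feeds, specify a start row key of <userId><0><0> and an end row key of <userId><Long.MAX_VALUE><Long.MAX_VALUE>. Because HBase keeps records sorted by row key, such a query is very fast.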
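The composite row key and the start/end range above can be sketched in plain Java. This is an illustration under assumptions, not HBase code: the three parts are assumed to be non-negative long values encoded big-endian, a TreeMap with an unsigned byte comparator stands in for the sorted table, and RowKeyRangeSketch, rowKey and scanUser are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a row-key range scan (hypothetical, not an HBase API). */
final class RowKeyRangeSketch {

    /** Orders byte arrays as unsigned bytes, left to right, like HBase row keys. */
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    /** Encodes <userId><timestamp><feedId> as 24 big-endian bytes. */
    static byte[] rowKey(long userId, long timestamp, long feedId) {
        return ByteBuffer.allocate(3 * Long.BYTES)
                .putLong(userId).putLong(timestamp).putLong(feedId).array();
    }

    static TreeMap<byte[], String> newTable() {
        return new TreeMap<>(UNSIGNED_BYTES);
    }

    /** Returns all values of one user: start key inclusive, stop key exclusive. */
    static List<String> scanUser(TreeMap<byte[], String> table, long userId) {
        byte[] start = rowKey(userId, 0L, 0L);
        byte[] stop = rowKey(userId, Long.MAX_VALUE, Long.MAX_VALUE);
        return new ArrayList<>(table.subMap(start, true, stop, false).values());
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> table = newTable();
        table.put(rowKey(2L, 150L, 9L), "feed-9");
        table.put(rowKey(1L, 200L, 8L), "feed-8");
        table.put(rowKey(1L, 100L, 7L), "feed-7");
        System.out.println(scanUser(table, 1L)); // prints [feed-7, feed-8]
    }
}
```

Note that with a plain ascending timestamp the oldest feed comes back first. A commonly used variation, not described in the text above, is to store Long.MAX_VALUE - timestamp in the key so that the newest rows sort first.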
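Advantages and disadvantages of HBase
1 Columns can be added dynamically, and an empty column stores no data, which saves storage space.
2 HBase splits data automatically, so storage scales horizontally by itself.
3 HBase supports highly concurrent reads and writes.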
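Disadvantages of HBase:
1 No conditional queries: efficient lookups are by Row key only. (Filters such as SingleColumnValueFilter, used in the listings later in this post, are applied while scanning rows rather than through an index.)
2 No Master server failover for now: when the Master goes down, the whole storage system goes down. (This reflects early HBase; later releases can run backup Masters that take over through ZooKeeper.)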
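Additional notes:
1. Data types: HBase has no rich type system. It stores every value as uninterpreted bytes (in practice usually strings), and all typing is left to the user. Relational databases offer a rich set of types and storage formats.
2. Data operations: HBase offers only simple operations such as insert, query, delete and truncate. Tables are independent of one another, with no complex relationships between them, whereas traditional databases typically provide a wide range of functions and join operations.
3. Storage model: HBase stores data by column. Each column family is kept in several files, and the files of different column families are separate. Traditional relational databases are stored by table structure in row format.
4. Data maintenance: an update in HBase is not really an update. It actually inserts new data, whereas a traditional database replaces the value in place.
5. Scalability: distributed databases like HBase were developed for exactly this purpose, so hardware can be added or removed easily and fault tolerance is relatively high. Traditional databases usually need an extra middleware layer to achieve something similar.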
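Point 1 of the list above says that typing is left to the user. The sketch below shows what that means in practice, using only the Java standard library. The class CellEncodingSketch and its methods are hypothetical names; the HBase client ships its own Bytes utility for this job (it appears in the listings later in this post), which the sketch intentionally does not depend on.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Encoding and decoding cell values by hand (hypothetical helper class). */
final class CellEncodingSketch {

    static byte[] encodeString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decodeString(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Fixed-width, big-endian encoding of an int. */
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    static int decodeInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        // The same age can be stored as text or as a binary int; the store
        // cannot tell the difference, so the reader must know which was used.
        byte[] ageAsText = encodeString("30");
        byte[] ageAsInt = encodeInt(30);
        System.out.println(ageAsText.length);         // prints 2
        System.out.println(ageAsInt.length);          // prints 4
        System.out.println(decodeString(ageAsText));  // prints 30
        System.out.println(decodeInt(ageAsInt));      // prints 30
    }
}
```

Whichever encoding is chosen, the writer and every reader have to agree on it, because nothing in the stored bytes records the type.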
All cells that share a Row key can be viewed as one record, and columns (that is, attributes) can be added to that record.
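The note that an HBase update is really an insert can be illustrated with a tiny versioned cell in plain Java. This is a sketch under assumptions, not HBase code: VersionedCellSketch is a hypothetical class, and it keeps every version forever, whereas real HBase limits the number of versions kept per column family.

```java
import java.util.Comparator;
import java.util.TreeMap;

/** Toy model of one cell whose writes never overwrite (hypothetical class). */
final class VersionedCellSketch {
    // Timestamp -> value, newest timestamp first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Comparator.<Long>reverseOrder());

    /** An "update" simply adds another timestamped version. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** A read returns the newest version, or null if nothing was written. */
    String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCellSketch age = new VersionedCellSketch();
        age.put(1000L, "30");
        age.put(2000L, "31"); // the "update" is a second insert
        System.out.println(age.latest());       // prints 31
        System.out.println(age.versionCount()); // prints 2
    }
}
```

The older value is still there after the second write; a read just prefers the newest timestamp.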
Basic usage
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseTest {

    public static Configuration configuration;

    static {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.property.clientPort", "2181");
        configuration.set("hbase.zookeeper.quorum", "192.168.1.100");
        configuration.set("hbase.master", "192.168.1.100:60000");
    }

    public static void main(String[] args) {
        // createTable("wujintao");
        // insertData("wujintao");
        // QueryAll("wujintao");
        // QueryByCondition1("wujintao");
        // QueryByCondition2("wujintao");
        // QueryByCondition3("wujintao");
        // deleteRow("wujintao", "abcdef");
        deleteByCondition("wujintao", "abcdef");
    }

    public static void createTable(String tableName) {
        System.out.println("start create table ......");
        try {
            HBaseAdmin hBaseAdmin = new HBaseAdmin(configuration);
            // If the table already exists, delete it first and then recreate it.
            if (hBaseAdmin.tableExists(tableName)) {
                hBaseAdmin.disableTable(tableName);
                hBaseAdmin.deleteTable(tableName);
                System.out.println(tableName + " already existed, deleted....");
            }
            HTableDescriptor tableDescriptor = new HTableDescriptor(tableName);
            tableDescriptor.addFamily(new HColumnDescriptor("column1"));
            tableDescriptor.addFamily(new HColumnDescriptor("column2"));
            tableDescriptor.addFamily(new HColumnDescriptor("column3"));
            hBaseAdmin.createTable(tableDescriptor);
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("end create table ......");
    }

    public static void insertData(String tableName) {
        System.out.println("start insert data ......");
        HTablePool pool = new HTablePool(configuration, 1000);
        HTable table = (HTable) pool.getTable(tableName);
        // One Put is one row; create another Put for a second row. Every row has
        // a unique rowkey, which is the value passed to the Put constructor.
        Put put = new Put("112233bbbcccc".getBytes());
        put.add("column1".getBytes(), null, "aaa".getBytes()); // first column of this row
        put.add("column2".getBytes(), null, "bbb".getBytes()); // second column of this row
        put.add("column3".getBytes(), null, "ccc".getBytes()); // third column of this row
        try {
            table.put(put);
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("end insert data ......");
    }

    public static void dropTable(String tableName) {
        try {
            HBaseAdmin admin = new HBaseAdmin(configuration);
            admin.disableTable(tableName);
            admin.deleteTable(tableName);
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void deleteRow(String tablename, String rowkey) {
        try {
            HTable table = new HTable(configuration, tablename);
            List<Delete> list = new ArrayList<Delete>();
            Delete d1 = new Delete(rowkey.getBytes());
            list.add(d1);
            table.delete(list);
            System.out.println("Row deleted successfully!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void deleteByCondition(String tablename, String rowkey) {
        // No effective API has been found so far for deleting by a non-rowkey
        // condition, nor one for clearing all data in a table.
    }

    // Prints the rowkey, then the column family and value of every cell in the row.
    private static void printResult(Result r) {
        System.out.println("Got rowkey: " + new String(r.getRow()));
        for (KeyValue keyValue : r.raw()) {
            System.out.println("Column family: " + new String(keyValue.getFamily())
                    + " ==== value: " + new String(keyValue.getValue()));
        }
    }

    public static void QueryAll(String tableName) {
        HTablePool pool = new HTablePool(configuration, 1000);
        HTable table = (HTable) pool.getTable(tableName);
        try {
            ResultScanner rs = table.getScanner(new Scan());
            try {
                for (Result r : rs) {
                    printResult(r);
                }
            } finally {
                rs.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void QueryByCondition1(String tableName) {
        HTablePool pool = new HTablePool(configuration, 1000);
        HTable table = (HTable) pool.getTable(tableName);
        try {
            Get get = new Get("abcdef".getBytes()); // query by rowkey
            Result r = table.get(get);
            if (r.isEmpty()) {
                System.out.println("No row found for rowkey abcdef");
                return;
            }
            printResult(r);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void QueryByCondition2(String tableName) {
        try {
            HTablePool pool = new HTablePool(configuration, 1000);
            HTable table = (HTable) pool.getTable(tableName);
            // Select rows whose column1 value equals "aaa".
            Filter filter = new SingleColumnValueFilter(Bytes.toBytes("column1"), null,
                    CompareOp.EQUAL, Bytes.toBytes("aaa"));
            Scan s = new Scan();
            s.setFilter(filter);
            ResultScanner rs = table.getScanner(s);
            try {
                for (Result r : rs) {
                    printResult(r);
                }
            } finally {
                rs.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void QueryByCondition3(String tableName) {
        try {
            HTablePool pool = new HTablePool(configuration, 1000);
            HTable table = (HTable) pool.getTable(tableName);

            List<Filter> filters = new ArrayList<Filter>();
            Filter filter1 = new SingleColumnValueFilter(Bytes.toBytes("column1"), null,
                    CompareOp.EQUAL, Bytes.toBytes("aaa"));
            filters.add(filter1);
            Filter filter2 = new SingleColumnValueFilter(Bytes.toBytes("column2"), null,
                    CompareOp.EQUAL, Bytes.toBytes("bbb"));
            filters.add(filter2);
            Filter filter3 = new SingleColumnValueFilter(Bytes.toBytes("column3"), null,
                    CompareOp.EQUAL, Bytes.toBytes("ccc"));
            filters.add(filter3);

            // A FilterList built this way requires every filter to match.
            FilterList filterList1 = new FilterList(filters);
            Scan scan = new Scan();
            scan.setFilter(filterList1);
            ResultScanner rs = table.getScanner(scan);
            try {
                for (Result r : rs) {
                    printResult(r);
                }
            } finally {
                rs.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Data retrieval example:
/*
 * Need Packages:
 * commons-codec-1.4.jar
 * commons-logging-1.1.1.jar
 * hadoop-0.20.2-core.jar
 * hbase-0.90.2.jar
 * log4j-1.2.16.jar
 * zookeeper-3.3.2.jar
 */
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseSelecter {

    public static Configuration configuration = null;

    static {
        configuration = HBaseConfiguration.create();
        // configuration.set("hbase.master", "192.168.0.201:60000");
        configuration.set("hbase.zookeeper.quorum", "idc01-hd-nd-03,idc01-hd-nd-04,idc01-hd-nd-05");
        // configuration.set("hbase.zookeeper.property.clientPort", "2181");
    }

    // Prints the rowkey, column family, qualifier and value of a single cell.
    private static void printKeyValue(KeyValue kv) {
        System.out.println("--------------------" + new String(kv.getRow()) + "----------------------------");
        System.out.println("Column Family: " + new String(kv.getFamily()));
        System.out.println("Column :" + new String(kv.getQualifier()));
        System.out.println("value : " + new String(kv.getValue()));
    }

    // Fetches every cell of one row and prints it.
    public static void selectRowKey(String tablename, String rowKey) throws IOException {
        HTable table = new HTable(configuration, tablename);
        try {
            Get g = new Get(rowKey.getBytes());
            Result rs = table.get(g);
            for (KeyValue kv : rs.raw()) {
                printKeyValue(kv);
            }
        } finally {
            table.close();
        }
    }

    // Fetches the cells of one row, restricted to a single column family.
    public static void selectRowKeyFamily(String tablename, String rowKey, String family) throws IOException {
        HTable table = new HTable(configuration, tablename);
        try {
            Get g = new Get(rowKey.getBytes());
            g.addFamily(Bytes.toBytes(family));
            Result rs = table.get(g);
            for (KeyValue kv : rs.raw()) {
                printKeyValue(kv);
            }
        } finally {
            table.close();
        }
    }

    // Fetches a single column (family + qualifier) of one row.
    public static void selectRowKeyFamilyColumn(String tablename, String rowKey, String family, String column)
            throws IOException {
        HTable table = new HTable(configuration, tablename);
        try {
            Get g = new Get(rowKey.getBytes());
            g.addColumn(family.getBytes(), column.getBytes());
            Result rs = table.get(g);
            for (KeyValue kv : rs.raw()) {
                printKeyValue(kv);
            }
        } finally {
            table.close();
        }
    }

    // Scans the table with one equality filter per entry of arr.
    // Each entry has the form "family,qualifier,value".
    public static void selectFilter(String tablename, List<String> arr) throws IOException {
        HTable table = new HTable(configuration, tablename);
        try {
            Scan scan = new Scan(); // create a scan
            FilterList filterList = new FilterList(); // list of filters, combined with "and"
            for (String v : arr) {
                // Index 0 is the column family, 1 the column name, 2 the condition value.
                String[] wheres = v.split(",");
                filterList.addFilter(new SingleColumnValueFilter(
                        wheres[0].getBytes(), wheres[1].getBytes(),
                        CompareOp.EQUAL,
                        wheres[2].getBytes()));
            }
            scan.setFilter(filterList);
            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result rs : scanner) {
                    for (KeyValue kv : rs.list()) {
                        printKeyValue(kv);
                    }
                }
            } finally {
                scanner.close();
            }
        } finally {
            table.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("Usage: HbaseSelecter table key");
            System.exit(-1);
        }

        System.out.println("Table: " + args[0] + " , key: " + args[1]);
        selectRowKey(args[0], args[1]);

        /*
        System.out.println("------------------------ query by rowkey ----------------------------------");
        selectRowKey("b2c", "yihaodian1002865");
        selectRowKey("b2c", "yihaodian1003396");

        System.out.println("------------------------ query by rowkey + column family ----------------------------------");
        selectRowKeyFamily("riapguh", "用戶A", "user");
        selectRowKeyFamily("riapguh", "用戶B", "user");

        System.out.println("------------------------ query by rowkey + column family + column name ----------------------------------");
        selectRowKeyFamilyColumn("riapguh", "用戶A", "user", "user_code");
        selectRowKeyFamilyColumn("riapguh", "用戶B", "user", "user_code");

        System.out.println("------------------------ query by condition ----------------------------------");
        List<String> arr = new ArrayList<String>();
        arr.add("dpt,dpt_code,d_001");
        arr.add("user,user_code,u_0001");
        selectFilter("riapguh", arr);
        */
    }
}
Source: https://www.cnblogs.com/zhenjing/p/hbase_example.html