本博文的主要內容有
.HBase的單機模式(1節點)安裝
.HBase的單機模式(1節點)的啟動
.HBase的偽分布模式(1節點)安裝
.HBase的偽分布模式(1節點)的啟動
.HBase的分布模式(3、5節點)安裝
.HBase的分布模式(3、5節點)的啟動
見博客: HBase HA的分布式集群部署
.HBase環境搭建60010端口無法訪問問題解決方案
------------- 注意 HBase1.X版本之后,沒60010了。 -------------
參考:http://blog.csdn.net/tian_li/article/details/50601210
.進入HBase Shell
.為什么在HBase,需要使用zookeeper?
.關於HBase的更多技術細節,強烈必多看
.獲取命令列表:help幫助命令
.創建表:create命令
.向表中加入行:put命令
.從表中檢索行:get命令
.讀取多行:scan命令
.統計表中的行數:count命令
.刪除行:delete命令
.清空表:truncate命令
.刪除表:drop命令
.更換表 :alter命令
想說的是,
HBase的安裝包里面有自帶zookeeper的。很多系統部署也是直接啟動上面的zookeeper。 本來也是沒有問題的,想想吧,系統里也只有hbase在用zookeeper。
先啟動zookeeper,再將hbase起來就好了 。
但是今天遇到了一個很蛋疼的問題。和同事爭論了很久。 因為我們是好多hbase集群共用一個zookeeper的,其中一個集群需要從hbase 0.90.2 升級到hbase 0.92上,自然,包也要更新。
但是其中一台regionserver上面同時也有跑zookeeper,而zookeeper還是用hbase 0.90.2 自帶的zookeeper在跑。
現在好了,升級一個regionserver,連着zookeeper也要受到牽連,看來必須要重啟,不然,jar包替換掉,可能會影響到zk正在跑的經常。
但是重啟zk畢竟對正在連接這個zk的client端會有短暫的影響。
真是蛋疼。本來只是升級hbase,zk卻強耦合了。
雖然后來證明zookeeper只要啟動了,哪怕jar包刪除也不會影響到正在跑的zk進程,但是這樣的不規范帶來的風險,實在是沒有必要。
所以作為運維,我強烈建議zk 和hbase分開部署,就直接部署官方的zk 好了,因為zk本身就是一個獨立的服務,沒有必要和hbase 耦合在一起。
在分布式的系統部署上面,一個角色就用一個專門的文件夾管理,不要用同一個目錄下,這樣子真的容易出問題。
當然datanode和tasktracker另當別論,他們本身關系密切。
當然,這里,我是玩的單節點的集群,來安裝HBase而已,只是來玩玩。所以,完全,只需用HBase的安裝包里自帶的zookeeper就好了。
除非,是多節點的分布式集群,最好用外部的zookeeper。
HDFS的版本,不同,HBase里的內部也不一樣。
.HBase的單機模式安裝
[hadoop@weekend110 app]$ ls
hadoop-2.4.1 hbase-0.96.2-hadoop2 hive-0.12.0 jdk1.7.0_65
[hadoop@weekend110 app]$ cd hbase-0.96.2-hadoop2/
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ ls
bin CHANGES.txt conf docs hbase-webapps lib LICENSE.txt logs NOTICE.txt README.txt
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ cd conf/
[hadoop@weekend110 conf]$ ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[hadoop@weekend110 conf]$ vim hbase-env.sh
# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=true
設HBASE_MANAGES_ZK=true,在啟動HBase時,HBase把Zookeeper作為自身的一部分運行。
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
[hadoop@weekend110 conf]$ ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[hadoop@weekend110 conf]$ vim hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///tmp/hbase-hadoop/hbase</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
在這里,有些資料上說,file:///tmp/hbase-${user.name}/hbase
可以看到,默認情況下HBase的數據存儲在根目錄下的tmp文件夾下的。熟悉Linux的人知道,此文件夾為臨時文件夾。也就是說,當系統重啟的時候,此文件夾中的內容將被清空。這樣用戶保存在HBase中的數據也會丟失,這當然是用戶不想看到的事情。因此,用戶需要將HBase數據的存儲位置修改為自己希望的存儲位置。
比如,可以,/home/hadoop/data/hbase,當然,我這里,是因為,偽分布模式和分布式模式,都玩過了。方便,練習加強HBase的shell操作。而已,拿單機模式玩玩。
.HBase的單機模式的啟動
總結就是:先啟動hadoop集群的進程,再啟動hbase的進程
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ cd bin
[hadoop@weekend110 bin]$ ls
get-active-master.rb hbase-common.sh hbase-jruby region_mover.rb start-hbase.cmd thread-pool.rb
graceful_stop.sh hbase-config.cmd hirb.rb regionservers.sh start-hbase.sh zookeepers.sh
hbase hbase-config.sh local-master-backup.sh region_status.rb stop-hbase.cmd
hbase-cleanup.sh hbase-daemon.sh local-regionservers.sh replication stop-hbase.sh
hbase.cmd hbase-daemons.sh master-backup.sh rolling-restart.sh test
[hadoop@weekend110 bin]$ jps
2443 NameNode
2970 NodeManager
2539 DataNode
2729 SecondaryNameNode
2866 ResourceManager
4634 Jps
[hadoop@weekend110 bin]$ ./start-hbase.sh
starting master, logging to /home/hadoop/app/hbase-0.96.2-hadoop2/logs/hbase-hadoop-master-weekend110.out
[hadoop@weekend110 bin]$ jps
2443 NameNode
2970 NodeManager
2539 DataNode
2729 SecondaryNameNode
2866 ResourceManager
4740 HMaster
4819 Jps
[hadoop@weekend110 bin]$ hbase shell
2016-10-12 12:43:11,095 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/app/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/app/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 3.8200 seconds
=> []
hbase(main):002:0> create 'mygirls', {NAME => 'base_info',VERSION => 3},{NAME => 'extra_info'}
Unknown argument ignored for column family base_info: 1.8.7
0 row(s) in 1.1560 seconds
=> Hbase::Table - mygirls
hbase(main):003:0>
測試
.HBase的偽分布模式(1節點)安裝
1、 hbase-0.96.2-hadoop2-bin.tar.gz壓縮包的上傳
sftp> cd /home/hadoop/app
sftp> put c:/hbase-0.96.2-hadoop2-bin.tar.gz
Uploading hbase-0.96.2-hadoop2-bin.tar.gz to /home/hadoop/app/hbase-0.96.2-hadoop2-bin.tar.gz
100% 77507KB 19376KB/s 00:00:04
c:/hbase-0.96.2-hadoop2-bin.tar.gz: 79367504 bytes transferred in 4 seconds (19376 KB/s)
sftp>
或者,通過
這里不多贅述。具體,可以看我的其他博客
2、 hbase-0.96.2-hadoop2-bin.tar.gz壓縮包的解壓
[hadoop@weekend110 app]$ ls
hadoop-2.4.1 hbase-0.96.2-hadoop2-bin.tar.gz hive-0.12.0 jdk1.7.0_65 zookeeper-3.4.6
[hadoop@weekend110 app]$ ll
total 77524
drwxr-xr-x. 11 hadoop hadoop 4096 Jul 18 20:11 hadoop-2.4.1
-rw-r--r--. 1 root root 79367504 May 20 13:51 hbase-0.96.2-hadoop2-bin.tar.gz
drwxrwxr-x. 10 hadoop hadoop 4096 Oct 10 21:30 hive-0.12.0
drwxr-xr-x. 8 hadoop hadoop 4096 Jun 17 2014 jdk1.7.0_65
drwxr-xr-x. 10 hadoop hadoop 4096 Jul 30 10:28 zookeeper-3.4.6
[hadoop@weekend110 app]$ tar -zxvf hbase-0.96.2-hadoop2-bin.tar.gz
3、刪除壓縮包hbase-0.96.2-hadoop2-bin.tar.gz
4、將HBase文件權限賦予給hadoop用戶,這一步,不需。
5、HBase的配置
注意啦,在hbase-0.96.2-hadoop2的目錄下,有hbase-webapps,即,說明,可以通過web網頁來訪問HBase。
[hadoop@weekend110 app]$ ls
hadoop-2.4.1 hbase-0.96.2-hadoop2 hive-0.12.0 jdk1.7.0_65 zookeeper-3.4.6
[hadoop@weekend110 app]$ cd hbase-0.96.2-hadoop2/
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ ll
total 436
drwxr-xr-x. 4 hadoop hadoop 4096 Mar 25 2014 bin
-rw-r--r--. 1 hadoop hadoop 403242 Mar 25 2014 CHANGES.txt
drwxr-xr-x. 2 hadoop hadoop 4096 Mar 25 2014 conf
drwxr-xr-x. 27 hadoop hadoop 4096 Mar 25 2014 docs
drwxr-xr-x. 7 hadoop hadoop 4096 Mar 25 2014 hbase-webapps
drwxrwxr-x. 3 hadoop hadoop 4096 Oct 11 17:49 lib
-rw-r--r--. 1 hadoop hadoop 11358 Mar 25 2014 LICENSE.txt
-rw-r--r--. 1 hadoop hadoop 897 Mar 25 2014 NOTICE.txt
-rw-r--r--. 1 hadoop hadoop 1377 Mar 25 2014 README.txt
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ cd conf/
[hadoop@weekend110 conf]$ ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[hadoop@weekend110 conf]$
對於,多節點里,安裝HBase,這里不多說了。具體,可以看我的博客
1.上傳hbase安裝包
2.解壓
3.配置hbase集群,要修改3個文件(首先zk集群已經安裝好了)
注意:要把hadoop的hdfs-site.xml和core-site.xml 放到hbase/conf下
3.1修改hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_55
//告訴hbase使用外部的zk
export HBASE_MANAGES_ZK=false
vim hbase-site.xml
<configuration>
<!-- 指定hbase在HDFS上存儲的路徑 -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://ns1/hbase</value>
</property>
<!-- 指定hbase是分布式的 -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- 指定zk的地址,多個用“,”分割 -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>weekend04:2181,weekend05:2181,weekend06:2181</value>
</property>
</configuration>
vim regionservers
weekend03
weekend04
weekend05
weekend06
3.2拷貝hbase到其他節點
scp -r /weekend/hbase-0.96.2-hadoop2/ weekend02:/weekend/
scp -r /weekend/hbase-0.96.2-hadoop2/ weekend03:/weekend/
scp -r /weekend/hbase-0.96.2-hadoop2/ weekend04:/weekend/
scp -r /weekend/hbase-0.96.2-hadoop2/ weekend05:/weekend/
scp -r /weekend/hbase-0.96.2-hadoop2/ weekend06:/weekend/
4.將配置好的HBase拷貝到每一個節點並同步時間。
5.啟動所有的hbase
分別啟動zk
./zkServer.sh start
啟動hbase集群
start-dfs.sh
啟動hbase,在主節點上運行:
start-hbase.sh
6.通過瀏覽器訪問hbase管理頁面
192.168.1.201:60010
7.為保證集群的可靠性,要啟動多個HMaster
hbase-daemon.sh start master
我這里,因,考慮到自己玩玩,偽分布集群里安裝HBase。
hbase-env.sh
[hadoop@weekend110 conf]$ ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[hadoop@weekend110 conf]$ vim hbase-env.sh
/home/hadoop/app/jdk1.7.0_65
單節點的hbase-env.sh,需要修改2處。
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
export HBASE_MANAGES_ZK=false
.為什么在HBase,需要使用zookeeper?
大家,很多人,都有一個疑問,為什么在HBase,需要使用zookeeper?至於為什么最好使用外部安裝的zookeeper,而不是HBase自帶的zookeeper,這里,我實在是不多贅述了。
zookeeper存儲的是HBase中ROOT表和META表的位置。此外,zookeeper還負責監控多個機器的狀態(每台機器到zookeeper中注冊一個實例)。當某台機器發生故障時
,zookeeper會第一時間感知到,並通知HBase Master進行相應的處理。同時,當HBase Master發生故障的時候,zookeeper還負責HBase Master的恢復工作,能夠保證還在同一時刻系統中只有一台HBase Master提供服務。
具體例子,見
HBase HA的分布式集群部署 的最低端。
hbase-site.xml
[hadoop@weekend110 conf]$ ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[hadoop@weekend110 conf]$ vim hbase-site.xml
<configuration>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/data/zookeeper/zkdata</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/home/hadoop/data/tmp/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://weekend110:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>false</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
新建目錄
/home/hadoop/data/zookeeper/zkdata
/home/hadoop/data/tmp/hbase
[hadoop@weekend110 conf]$ pwd
/home/hadoop/app/hbase-0.96.2-hadoop2/conf
[hadoop@weekend110 conf]$ mkdir -p /home/hadoop/data/zookeeper/zkdata
[hadoop@weekend110 conf]$ mkdir -p /home/hadoop/data/tmp/hbase
[hadoop@weekend110 conf]$
regionservers
weekend110
[hadoop@weekend110 conf]$ ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[hadoop@weekend110 conf]$ cp /home/hadoop/app/hadoop-2.4.1/etc/hadoop/{core-site.xml,hdfs-site.xml} ./
[hadoop@weekend110 conf]$ ls
core-site.xml hbase-env.cmd hbase-policy.xml hdfs-site.xml regionservers
hadoop-metrics2-hbase.properties hbase-env.sh hbase-site.xml log4j.properties
[hadoop@weekend110 conf]$
vi /etc/profile
[hadoop@weekend110 conf]$ su root
Password:
[root@weekend110 conf]# vim /etc/profile
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
export HADOOP_HOME=/home/hadoop/app/hadoop-2.4.1
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper-3.4.6
export HIVE_HOME=/home/hadoop/app/hive-0.12.0
export HBASE_HOME=/home/hadoop/app/hbase-0.96.2-hadoop2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin
[root@weekend110 conf]# source /etc/profile
[root@weekend110 conf]# su hadoop
.HBase的偽分布模式的啟動
由於偽分布模式的運行基於HDFS,因此在運行HBase之前首先需要啟動HDFS,
[hadoop@weekend110 hadoop-2.4.1]$ jps
5802 Jps
[hadoop@weekend110 hadoop-2.4.1]$ sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [weekend110]
weekend110: starting namenode, logging to /home/hadoop/app/hadoop-2.4.1/logs/hadoop-hadoop-namenode-weekend110.out
weekend110: starting datanode, logging to /home/hadoop/app/hadoop-2.4.1/logs/hadoop-hadoop-datanode-weekend110.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-weekend110.out
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.4.1/logs/yarn-hadoop-resourcemanager-weekend110.out
weekend110: starting nodemanager, logging to /home/hadoop/app/hadoop-2.4.1/logs/yarn-hadoop-nodemanager-weekend110.out
[hadoop@weekend110 hadoop-2.4.1]$ jps
6022 DataNode
6149 SecondaryNameNode
5928 NameNode
6287 ResourceManager
6426 Jps
6387 NodeManager
[hadoop@weekend110 hadoop-2.4.1]$
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ pwd
/home/hadoop/app/hbase-0.96.2-hadoop2
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ ls
bin CHANGES.txt conf docs hbase-webapps lib LICENSE.txt NOTICE.txt README.txt
[hadoop@weekend110 hbase-0.96.2-hadoop2]$ cd bin
[hadoop@weekend110 bin]$ ls
get-active-master.rb hbase-common.sh hbase-jruby region_mover.rb start-hbase.cmd thread-pool.rb
graceful_stop.sh hbase-config.cmd hirb.rb regionservers.sh start-hbase.sh zookeepers.sh
hbase hbase-config.sh local-master-backup.sh region_status.rb stop-hbase.cmd
hbase-cleanup.sh hbase-daemon.sh local-regionservers.sh replication stop-hbase.sh
hbase.cmd hbase-daemons.sh master-backup.sh rolling-restart.sh test
[hadoop@weekend110 bin]$ ./start-hbase.sh
starting master, logging to /home/hadoop/app/hbase-0.96.2-hadoop2/logs/hbase-hadoop-master-weekend110.out
[hadoop@weekend110 bin]$ jps
6022 DataNode
6149 SecondaryNameNode
5928 NameNode
6707 Jps
6287 ResourceManager
6530 HMaster
6387 NodeManager
[hadoop@weekend110 bin]$
參考博客:http://blog.csdn.net/u013575812/article/details/46919011
[hadoop@weekend110 bin]$ pwd
/home/hadoop/app/hbase-0.96.2-hadoop2/bin
[hadoop@weekend110 bin]$ hadoop dfsadmin -safemode leave
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Safe mode is OFF
[hadoop@weekend110 bin]$ jps
6022 DataNode
7135 Jps
6149 SecondaryNameNode
5928 NameNode
6287 ResourceManager
6387 NodeManager
[hadoop@weekend110 bin]$ ./start-hbase.sh
starting master, logging to /home/hadoop/app/hbase-0.96.2-hadoop2/logs/hbase-hadoop-master-weekend110.out
[hadoop@weekend110 bin]$ jps
6022 DataNode
7245 HMaster
6149 SecondaryNameNode
5928 NameNode
6287 ResourceManager
6387 NodeManager
7386 Jps
[hadoop@weekend110 bin]$
依舊如此,繼續...解決!
參考博客:http://www.th7.cn/db/nosql/201510/134214.shtml
在安裝hbase-0.96.2-hadoop2時發現一個問題,hbase能夠正常使用,hbase shell 完全可用,但是60010頁面卻打不開,最后找到問題,是因為很多版本的hbase的master web 默認是不運行的,所以需要自己配置默認端口。
配置如下
在hbase-site.xml中加入一下內容即可
<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.regionserver .info.port</name>
<value>60020</value>
</property>
<configuration>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/data/zookeeper/zkdata</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/home/hadoop/data/tmp/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.regionserver.info.port</name>
<value>60020</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://weekend110:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>false</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
進入HBase Shell
進入hbase命令行
./hbase shell
顯示hbase中的表
list
創建user表,包含info、data兩個列族
create 'user', 'info1', 'data1'
create 'user', {NAME => 'info', VERSIONS => '3'}
向user是表中插入信息,row key是rk0001,列族是info中添加name是列修飾符(標識符),值是zhangsan
put 'user', 'rk0001', 'info:name', 'zhangsan'
向user表中插入信息,row key是rk0001,列族info中添加gender列標示符,值是female
put 'user', 'rk0001', 'info:gender', 'female'
向user表中插入信息,row key是rk0001,列族info中添加age列標示符,值是20
put 'user', 'rk0001', 'info:age', 20
向user表中插入信息,row key是rk0001,列族data中添加pic列標示符,值是picture
put 'user', 'rk0001', 'data:pic', 'picture'
獲取user表中row key為rk0001的所有信息
get 'user', 'rk0001'
獲取user表中row key是rk0001,info列族的所有信息
get 'user', 'rk0001', 'info'
獲取user表中row key是rk0001,info列族的name、age列標示符的信息
get 'user', 'rk0001', 'info:name', 'info:age'
獲取user表中row key是rk0001,info、data列族的信息
get 'user', 'rk0001', 'info', 'data'
get 'user', 'rk0001', {COLUMN => ['info', 'data']}
get 'user', 'rk0001', {COLUMN => ['info:name', 'data:pic']}
獲取user表中row key是rk0001,列族是info,版本號最新5個的信息
get 'user', 'rk0001', {COLUMN => 'info', VERSIONS => 2}
get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5}
get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5, TIMERANGE => [1392368783980, 1392380169184]}
獲取user表中row key是rk0001,cell的值是zhangsan的信息
get 'people', 'rk0001', {FILTER => "ValueFilter(=, 'binary:圖片')"}
獲取user表中row key是rk0001,列標示符中含有a的信息
get 'people', 'rk0001', {FILTER => "(QualifierFilter(=,'substring:a'))"}
put 'user', 'rk0002', 'info:name', 'fanbingbing'
put 'user', 'rk0002', 'info:gender', 'female'
put 'user', 'rk0002', 'info:nationality', '中國'
get 'user', 'rk0002', {FILTER => "ValueFilter(=, 'binary:中國')"}
查詢user表中的所有信息
scan 'user'
查詢user表中列族為info的信息
scan 'user', {COLUMNS => 'info'}
scan 'user', {COLUMNS => 'info', RAW => true, VERSIONS => 5}
scan 'persion', {COLUMNS => 'info', RAW => true, VERSIONS => 3}
查詢user表中列族為info和data的信息
scan 'user', {COLUMNS => ['info', 'data']}
scan 'user', {COLUMNS => ['info:name', 'data:pic']}
查詢user表中列族是info、列標示符是name的信息
scan 'user', {COLUMNS => 'info:name'}
查詢user表中列族是info、列標示符是name的信息,並且版本最新的5個
scan 'user', {COLUMNS => 'info:name', VERSIONS => 5}
查詢user表中列族為info和data且列標示符中含有a字符的信息
scan 'user', {COLUMNS => ['info', 'data'], FILTER => "(QualifierFilter(=,'substring:a'))"}
查詢user表中列族為info,rk范圍是[rk0001, rk0003)的數據
scan 'people', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}
查詢user表中row key以rk字符開頭的
scan 'user',{FILTER=>"PrefixFilter('rk')"}
查詢user表中指定范圍的數據
scan 'user', {TIMERANGE => [1392368783980, 1392380169184]}
刪除user表row key為rk0001,列標示符為info:name的數據
delete 'people', 'rk0001', 'info:name'
刪除user表row key為rk0001,列標示符為info:name,timestamp為1392383705316的數據
delete 'user', 'rk0001', 'info:name', 1392383705316
清空user表中的數據
truncate 'people'
修改表結構
首先停用user表(新版本不用)
disable 'user'
添加兩個列族f1和f2
alter 'people', NAME => 'f1'
alter 'user', NAME => 'f2'
啟用表
enable 'user'
###disable 'user'(新版本不用)
刪除一個列族:
alter 'user', NAME => 'f1', METHOD => 'delete' 或 alter 'user', 'delete' => 'f1'
添加列族f1同時刪除列族f2
alter 'user', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
將user表的f1列族版本號改為5
alter 'people', NAME => 'info', VERSIONS => 5
啟用表
enable 'user'
刪除表
disable 'user'
drop 'user'
get 'person', 'rk0001', {FILTER => "ValueFilter(=, 'binary:中國')"}
get 'person', 'rk0001', {FILTER => "(QualifierFilter(=,'substring:a'))"}
scan 'person', {COLUMNS => 'info:name'}
scan 'person', {COLUMNS => ['info', 'data'], FILTER => "(QualifierFilter(=,'substring:a'))"}
scan 'person', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}
scan 'person', {COLUMNS => 'info', STARTROW => '20140201', ENDROW => '20140301'}
scan 'person', {COLUMNS => 'info:name', TIMERANGE => [1395978233636, 1395987769587]}
delete 'person', 'rk0001', 'info:name'
alter 'person', NAME => 'ffff'
alter 'person', NAME => 'info', VERSIONS => 10
get 'user', 'rk0002', {COLUMN => ['info:name', 'data:pic']}
[hadoop@weekend110 bin]$ pwd
/home/hadoop/app/hbase-0.96.2-hadoop2/bin
[hadoop@weekend110 bin]$ ./hbase shell
2016-10-12 10:09:42,925 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0>
hbase(main):001:0> help
HBase Shell, version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS: //羅列出了所有的命令
Group name: general //通常命令,這些命令將返回集群級的通用信息。
Commands: status, table_help, version, whoami
Group name: ddl //ddl操作命令,這些命令會創建、更換和刪除HBase表
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters
Group name: namespace //namespace命令
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml //dml操作命令,這些命令會新增、修改和刪除HBase表中的數據
Commands: count, delete, deleteall, get, get_counter, incr, put, scan, truncate, truncate_preserve
Group name: tools //tools命令,這些命令可以維護HBase集群
Commands: assign, balance_switch, balancer, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, flush, hlog_roll, major_compact, merge_region, move, split, trace, unassign, zk_dump
Group name: replication //replication命令,這些命令可以增加和刪除集群的節點
Commands: add_peer, disable_peer, enable_peer, list_peers, list_replicated_tables, remove_peer
Group name: snapshot //snapshot命令,這些命令用於對HBase集群進行快照以便備份和恢復集群
Commands: clone_snapshot, delete_snapshot, list_snapshots, rename_snapshot, restore_snapshot, snapshot
Group name: security //security命令,這些命令可以控制HBase的安全性
Commands: grant, revoke, user_permission
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/docs/current/book.html
hbase(main):002:0>
hbase(main):002:0> version
0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):003:0> create
ERROR: wrong number of arguments (0 for 1)
Here is some help for this command:
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples: //這里,有例子
Create a table with namespace=ns1 and table qualifier=t1
hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
Create a table with namespace=default and table qualifier=t1
hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
hbase> # The above in shorthand would be the following:
hbase> create 't1', 'f1', 'f2', 'f3'
hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
Table configuration options can be put at the end.
Examples: //這里,有例子
hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
hbase> # Optionally pre-split the table into NUMREGIONS, using
hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
You can also keep around a reference to the created table:
hbase> t1 = create 't1', 'f1'
Which gives you a reference to the table named 't1', on which you can then
call methods.
hbase(main):004:0>
[hadoop@weekend110 bin]$ jps
2443 NameNode
2970 NodeManager
7515 Jps
2539 DataNode
2729 SecondaryNameNode
2866 ResourceManager
[hadoop@weekend110 bin]$ pwd
/home/hadoop/app/hbase-0.96.2-hadoop2/bin
[hadoop@weekend110 bin]$ ./start-hbase.sh
starting master, logging to /home/hadoop/app/hbase-0.96.2-hadoop2/logs/hbase-hadoop-master-weekend110.out
[hadoop@weekend110 bin]$ jps
2443 NameNode
7623 HMaster
2970 NodeManager
2539 DataNode
2729 SecondaryNameNode
2866 ResourceManager
7686 Jps
[hadoop@weekend110 bin]$ ./hbase shell
2016-10-12 15:53:46,394 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/app/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/app/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 2.8190 seconds
=> []
hbase(main):002:0> create 'mygirls', {NAME => 'base_info',VERSIONS => 3},{NAME => 'extra_info'}
0 row(s) in 1.1080 seconds
=> Hbase::Table - mygirls
hbase(main):003:0>
describe
hbase(main):010:0> describe 'mygirls'
DESCRIPTION ENABLED
'mygirls', {NAME => 'base_info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', true
REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0
', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMOR
Y => 'false', BLOCKCACHE => 'true'}, {NAME => 'extra_info', DATA_BLOCK_ENCODING => 'N
ONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =>
'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLO
CKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0980 seconds
hbase(main):011:0>
disable 和 drop ,先得disable(下線),再才能drop(刪掉)
hbase(main):002:0> disable 'mygirls'
0 row(s) in 1.2030 seconds
hbase(main):003:0> drop 'mygirls'
0 row(s) in 0.4270 seconds
hbase(main):004:0>
put
hbase(main):011:0> put
ERROR: wrong number of arguments (0 for 4)
Here is some help for this command:
Put a cell 'value' at specified table/row/column and optionally
timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1'
at row 'r1' under column 'c1' marked with the time 'ts1', do:
hbase> put 'ns1:t1', 'r1', 'c1', 'value', ts1 //例子
The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:
hbase> t.put 'r1', 'c1', 'value', ts1 //例子
hbase(main):012:0> put 'mygirls','0001','base_info:name','fengjie'
0 row(s) in 0.3240 seconds
hbase(main):013:0> put 'mygirls','0001','base_info:age','18'
0 row(s) in 0.0170 seconds
hbase(main):014:0> put 'mygirls','0001','base_info:sex','jipinnvren'
0 row(s) in 0.0130 seconds
hbase(main):015:0> put 'mygirls','0001','base_info:boyfriend','huangxiaoming'
0 row(s) in 0.0590 seconds
get
hbase(main):016:0> get
ERROR: wrong number of arguments (0 for 2)
Here is some help for this command:
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples: //例子
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
Besides the default 'toStringBinary' format, 'get' also supports custom formatting by
column. A user can define a FORMATTER by adding it to the column name in the get
specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifer). You cannot specify
a FORMATTER for all columns of a column family.
The same commands also can be run on a reference to a table (obtained via get_table or
create_table). Suppose you had a reference t to table 't1', the corresponding commands
would be:
hbase> t.get 'r1'
hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
hbase> t.get 'r1', {COLUMN => 'c1'}
hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> t.get 'r1', 'c1'
hbase> t.get 'r1', 'c1', 'c2'
hbase> t.get 'r1', ['c1', 'c2']
hbase(main):017:0>
hbase(main):017:0> get 'mygirls','0001'
COLUMN CELL
base_info:age timestamp=1476259587999, value=18
base_info:boyfriend timestamp=1476259597469, value=huangxiaoming
base_info:name timestamp=1476259582217, value=fengjie
base_info:sex timestamp=1476259593138, value=jipinnvren
4 row(s) in 0.0320 seconds
hbase(main):018:0> put 'mygirls','0001','base_info:name','fengbaobao'
0 row(s) in 0.0140 seconds
hbase(main):019:0> get 'mygirls','0001'
COLUMN CELL
base_info:age timestamp=1476259587999, value=18
base_info:boyfriend timestamp=1476259597469, value=huangxiaoming
base_info:name timestamp=1476259871197, value=fengbaobao
base_info:sex timestamp=1476259593138, value=jipinnvren
4 row(s) in 0.0480 seconds
hbase(main):020:0> get 'mygirls','0001',{COLUMN => 'base_info:name',VERSIONS => 10}
COLUMN CELL
base_info:name timestamp=1476259871197, value=fengbaobao
base_info:name timestamp=1476259582217, value=fengjie
2 row(s) in 0.0700 seconds
得到,2個版本。意思是,最多可得到10個版本
hbase(main):021:0> put 'mygirls','0001','base_info:name','fengfeng'
0 row(s) in 0.0240 seconds
hbase(main):022:0> get 'mygirls','0001',{COLUMN => 'base_info:name',VERSIONS => 10}
COLUMN CELL
base_info:name timestamp=1476260199839, value=fengfeng
base_info:name timestamp=1476259871197, value=fengbaobao
base_info:name timestamp=1476259582217, value=fengjie
3 row(s) in 0.0550 seconds
hbase(main):023:0> put 'mygirls','0001','base_info:name','qinaidefeng'
0 row(s) in 0.0160 seconds
hbase(main):024:0> get 'mygirls','0001',{COLUMN => 'base_info:name',VERSIONS => 10}
COLUMN CELL
base_info:name timestamp=1476260274142, value=qinaidefeng
base_info:name timestamp=1476260199839, value=fengfeng
base_info:name timestamp=1476259871197, value=fengbaobao
3 row(s) in 0.0400 seconds
只存,最近的3個版本。
scan
hbase(main):025:0> scan
ERROR: wrong number of arguments (0 for 1)
Here is some help for this command:
Scan a table; pass table name and optionally a dictionary of scanner
specifications. Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,
or COLUMNS, CACHE
If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family:'.
The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the
Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.
Some examples: //例子
hbase> scan 'hbase:meta'
hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND
(QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
hbase> scan 't1', {FILTER =>
org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false). By
default it is enabled. Examples:
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Also for experts, there is an advanced option -- RAW -- which instructs the
scanner to return all cells (including delete markers and uncollected deleted
cells). This option cannot be combined with requesting specific COLUMNS.
Disabled by default. Example:
hbase> scan 't1', {RAW => true, VERSIONS => 10}
Besides the default 'toStringBinary' format, 'scan' supports custom formatting
by column. A user can define a FORMATTER by adding it to the column name in
the scan specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifer). You cannot
specify a FORMATTER for all columns of a column family.
Scan can also be used directly from a table, by first getting a reference to a
table, like such:
hbase> t = get_table 't'
hbase> t.scan
Note in the above situation, you can still provide all the filtering, columns,
options, etc as described above.
hbase(main):026:0>
hbase(main):026:0> scan 'mygirls'
ROW COLUMN+CELL
0001 column=base_info:age, timestamp=1476259587999, value=18
0001 column=base_info:boyfriend, timestamp=1476259597469, value=huangxiaoming
0001 column=base_info:name, timestamp=1476260274142, value=qinaidefeng
0001 column=base_info:sex, timestamp=1476259593138, value=jipinnvren
1 row(s) in 0.1040 seconds
hbase(main):027:0> scan 'mygirls',{RAW => true,VERSIONS => 10}
ROW COLUMN+CELL
0001 column=base_info:age, timestamp=1476259587999, value=18
0001 column=base_info:boyfriend, timestamp=1476259597469, value=huangxiaoming
0001 column=base_info:name, timestamp=1476260274142, value=qinaidefeng
0001 column=base_info:name, timestamp=1476260199839, value=fengfeng
0001 column=base_info:name, timestamp=1476259871197, value=fengbaobao
0001 column=base_info:name, timestamp=1476259582217, value=fengjie
0001 column=base_info:sex, timestamp=1476259593138, value=jipinnvren
1 row(s) in 0.0660 seconds
hbase(main):028:0> get 'mygirls','0001',{COLUMN => 'base_info:name',VERSIONS => 10,RAW => true}
COLUMN CELL
base_info:name timestamp=1476260274142, value=qinaidefeng
base_info:name timestamp=1476260199839, value=fengfeng
base_info:name timestamp=1476259871197, value=fengbaobao
3 row(s) in 0.0170 seconds
為什么,scan能把老版本都顯示出來?
答:原因是,你最新的操作,並沒有真正刷到Hadoop文件里。
最新的操作,是在HLog里。
要等到,把這個HLog刷到MeroStore里或HFile里,才能把那些過期的數據給清除掉。
那多久才會刷呢?有個默認機制。 還有,立即把HBase停掉。
其實啊,HBase Shell操作,在平常工作中,都不會這么操作。因為,不建議,很不好用,不方便。!!!
hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND
(QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
hbase> scan 't1', {FILTER =>
org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
更多,自行去玩玩。這里,不多贅述。
進入hbase命令行
./hbase shell
顯示hbase中的表
list
創建user表,包含info、data兩個列族
create 'user', 'info1', 'data1'
create 'user', {NAME => 'info', VERSIONS => '3'}
向user表中插入信息,row key為rk0001,列族info中添加name列標示符,值為zhangsan
put 'user', 'rk0001', 'info:name', 'zhangsan'
向user表中插入信息,row key為rk0001,列族info中添加gender列標示符,值為female
put 'user', 'rk0001', 'info:gender', 'female'
向user表中插入信息,row key為rk0001,列族info中添加age列標示符,值為20
put 'user', 'rk0001', 'info:age', 20
向user表中插入信息,row key為rk0001,列族data中添加pic列標示符,值為picture
put 'user', 'rk0001', 'data:pic', 'picture'
獲取user表中row key為rk0001的所有信息
get 'user', 'rk0001'
獲取user表中row key為rk0001,info列族的所有信息
get 'user', 'rk0001', 'info'
獲取user表中row key為rk0001,info列族的name、age列標示符的信息
get 'user', 'rk0001', 'info:name', 'info:age'
獲取user表中row key為rk0001,info、data列族的信息
get 'user', 'rk0001', 'info', 'data'
get 'user', 'rk0001', {COLUMN => ['info', 'data']}
get 'user', 'rk0001', {COLUMN => ['info:name', 'data:pic']}
獲取user表中row key為rk0001,列族為info,版本號最新5個的信息
get 'user', 'rk0001', {COLUMN => 'info', VERSIONS => 2}
get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5}
get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5, TIMERANGE => [1392368783980, 1392380169184]}
獲取user表中row key為rk0001,cell的值為zhangsan的信息
get 'people', 'rk0001', {FILTER => "ValueFilter(=, 'binary:圖片')"}
獲取user表中row key為rk0001,列標示符中含有a的信息
get 'people', 'rk0001', {FILTER => "(QualifierFilter(=,'substring:a'))"}
put 'user', 'rk0002', 'info:name', 'fanbingbing'
put 'user', 'rk0002', 'info:gender', 'female'
put 'user', 'rk0002', 'info:nationality', '中國'
get 'user', 'rk0002', {FILTER => "ValueFilter(=, 'binary:中國')"}
查詢user表中的所有信息
scan 'user'
查詢user表中列族為info的信息
scan 'user', {COLUMNS => 'info'}
scan 'user', {COLUMNS => 'info', RAW => true, VERSIONS => 5}
scan 'persion', {COLUMNS => 'info', RAW => true, VERSIONS => 3}
查詢user表中列族為info和data的信息
scan 'user', {COLUMNS => ['info', 'data']}
scan 'user', {COLUMNS => ['info:name', 'data:pic']}
查詢user表中列族為info、列標示符為name的信息
scan 'user', {COLUMNS => 'info:name'}
查詢user表中列族為info、列標示符為name的信息,並且版本最新的5個
scan 'user', {COLUMNS => 'info:name', VERSIONS => 5}
查詢user表中列族為info和data且列標示符中含有a字符的信息
scan 'user', {COLUMNS => ['info', 'data'], FILTER => "(QualifierFilter(=,'substring:a'))"}
查詢user表中列族為info,rk范圍是[rk0001, rk0003)的數據
scan 'people', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}
查詢user表中row key以rk字符開頭的
scan 'user',{FILTER=>"PrefixFilter('rk')"}
查詢user表中指定范圍的數據
scan 'user', {TIMERANGE => [1392368783980, 1392380169184]}
刪除數據
刪除user表row key為rk0001,列標示符為info:name的數據
delete 'people', 'rk0001', 'info:name'
刪除user表row key為rk0001,列標示符為info:name,timestamp為1392383705316的數據
delete 'user', 'rk0001', 'info:name', 1392383705316
清空user表中的數據
truncate 'people'
修改表結構
首先停用user表(新版本不用)
disable 'user'
添加兩個列族f1和f2
alter 'people', NAME => 'f1'
alter 'user', NAME => 'f2'
啟用表
enable 'user'
###disable 'user'(新版本不用)
刪除一個列族:
alter 'user', NAME => 'f1', METHOD => 'delete' 或 alter 'user', 'delete' => 'f1'
添加列族f1同時刪除列族f2
alter 'user', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
將user表的f1列族版本號改為5
alter 'people', NAME => 'info', VERSIONS => 5
啟用表
enable 'user'
刪除表
disable 'user'
drop 'user'
get 'person', 'rk0001', {FILTER => "ValueFilter(=, 'binary:中國')"}
get 'person', 'rk0001', {FILTER => "(QualifierFilter(=,'substring:a'))"}
scan 'person', {COLUMNS => 'info:name'}
scan 'person', {COLUMNS => ['info', 'data'], FILTER => "(QualifierFilter(=,'substring:a'))"}
scan 'person', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}
scan 'person', {COLUMNS => 'info', STARTROW => '20140201', ENDROW => '20140301'}
scan 'person', {COLUMNS => 'info:name', TIMERANGE => [1395978233636, 1395987769587]}
delete 'person', 'rk0001', 'info:name'
alter 'person', NAME => 'ffff'
alter 'person', NAME => 'info', VERSIONS => 10
get 'user', 'rk0002', {COLUMN => ['info:name', 'data:pic']}
hbase的java api
1、建立hbase工程
推薦http://hbase.apache.org/
.關於HBase的更多技術細節,強烈必多看
http://abloz.com/hbase/book.html
2、 weekend110-hbase -> Build Path -> Configure Build Path
我們是hadoop-2.4.1.jar,但是,hbase-0.96.2-hadoop2-bin.tar.gz自帶的是,hadoop-2.2.0.jar。
這里,我參考了《Hadoop 實戰》,陸嘉恆老師編著的。P249頁,
注意:安裝Hadoop的時候,要注意HBase的版本。也就是說,需要注意Hadoop和HBase之間的版本關系,如果不匹配,很可能會影響HBase系統的穩定性。在HBase的lin目錄下可以看到對應的Hadoop的JAR文件。默認情況下,HBase的lib文件夾下對應的Hadoop版本相對穩定。如果用戶想要使用其他的Hadoop版本,那么需要將Hadoop系統安裝目錄hadoop-*.*.*-core.jar文件和hadoop-*.*.*-test.jar文件復制到HBase的lib文件夾下,以替換其他版本的Hadoop文件。
這里,我們為了方便,直接把D:\SoftWare\hbase-0.96.2-hadoop2\lib的所有jar包,都弄進來。
也參考了網上一些博客資料說,不需這么多。此外,程序可能包含一些間接引用,以后再逐步逐個,下載,添加就是。復制粘貼到 hbase-0.96.2-hadoop2lib 里。
參考我的博客
Eclipse下新建Maven項目、自動打依賴jar包
參考 : http://blog.csdn.net/hetianguang/article/details/51371713
http://www.cnblogs.com/NicholasLee/archive/2012/09/13/2683432.html
3、新建包cn.itcast.bigdata.hbase
4、新建類 HbaseDao.java
這里,我就以分布式集群的配置,附上代碼。工作中,就是這么干的!
package cn.itcast.bigdata.hbase;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Test;
public class HbaseDao {
@Test
public void insertTest() throws Exception{
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "weekend05:2181,weekend06:2181,weekend07:2181");
HTable nvshen = new HTable(conf, "nvshen");
Put name = new Put(Bytes.toBytes("rk0001"));
name.add(Bytes.toBytes("base_info"), Bytes.toBytes("name"), Bytes.toBytes("angelababy"));
Put age = new Put(Bytes.toBytes("rk0001"));
age.add(Bytes.toBytes("base_info"), Bytes.toBytes("age"), Bytes.toBytes(18));
ArrayList<Put> puts = new ArrayList<>();
puts.add(name);
puts.add(age);
nvshen.put(puts);
}
public static void main(String[] args) throws Exception {
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "weekend05:2181,weekend06:2181,weekend07:2181");
HBaseAdmin admin = new HBaseAdmin(conf);
TableName name = TableName.valueOf("nvshen");
HTableDescriptor desc = new HTableDescriptor(name);
HColumnDescriptor base_info = new HColumnDescriptor("base_info");
HColumnDescriptor extra_info = new HColumnDescriptor("extra_info");
base_info.setMaxVersions(5);
desc.addFamily(base_info);
desc.addFamily(extra_info);
admin.createTable(desc);
}
}
或者 HbaseDemo.java
package cn.itcast.bigdata.hbase;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Test;
public class HbaseDao {
@Test
public void insertTest() throws Exception{
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "weekend05:2181,weekend06:2181,weekend07:2181");
HTable nvshen = new HTable(conf, "nvshen");
Put name = new Put(Bytes.toBytes("rk0001"));
name.add(Bytes.toBytes("base_info"), Bytes.toBytes("name"), Bytes.toBytes("angelababy"));
Put age = new Put(Bytes.toBytes("rk0001"));
age.add(Bytes.toBytes("base_info"), Bytes.toBytes("age"), Bytes.toBytes(18));
ArrayList<Put> puts = new ArrayList<>();
puts.add(name);
puts.add(age);
nvshen.put(puts);
}
public static void main(String[] args) throws Exception {
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "weekend05:2181,weekend06:2181,weekend07:2181");
HBaseAdmin admin = new HBaseAdmin(conf);
TableName name = TableName.valueOf("nvshen");
HTableDescriptor desc = new HTableDescriptor(name);
HColumnDescriptor base_info = new HColumnDescriptor("base_info");
HColumnDescriptor extra_info = new HColumnDescriptor("extra_info");
base_info.setMaxVersions(5);
desc.addFamily(base_info);
desc.addFamily(extra_info);
admin.createTable(desc);
}
}