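Purpose: migrate data from MySQL to HBase, with the Phoenix framework serving as the CRUD layer on top of HBase so that existing SQL carries over with little change. This addresses slow queries caused by a single MySQL table growing too large.
HBase is built for very large tables, and PB-scale data is well within its range. For Phoenix's own performance claims, see the official FAQ: http://phoenix.apache.org/faq.html#How_fast_is_Phoenix_Why_is_it_so_fast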
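1. HBase and Phoenix environment setup
Installing HBase belongs to the big-data world, which most Java developers are not very familiar with, especially since it also involves Hadoop and related frameworks. The installation is tedious enough to make you want to smash your keyboard.
So it is best to install with Docker: deployment is easy, and it is good enough for development and testing. Leave production to the experts and the ops team! And if you are the expert, then it is on you ^_^
Installing Docker and using docker-compose are not covered here.
Environment deployment (source: https://gitee.com/assad/docker-hbase)
Clone the repository:
git clone https://gitee.com/assad/docker-hbase.git
Build the base images with docker-compose:
make build
Run the Standalone version
The Standalone version has the same layout as the quickstart in the HBase documentation, except that it uses Hadoop's HDFS as the underlying storage. It is mainly intended for quickly setting up a development or test environment.
docker-compose -f docker-compose-standalone.yml up -d
Run the Local Distributed (pseudo-distributed) version
This version uses a standalone ZooKeeper, and the HMaster and HRegionServer nodes run in separate containers.
docker-compose -f docker-compose-distributed-local.yml up -d
Check the status of the containers: docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Ports}}\t{{.Status}}"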
CONTAINER ID NAMES PORTS STATUS
82523e8c14a7 nodemanager 8042/tcp Up 5 hours (healthy)
bf81f61f2df9 datanode 50075/tcp Up 5 hours (healthy)
1a9e2f1439e2 namenode 50070/tcp Up 5 hours (healthy)
1473a0330fc6 historyserver 8188/tcp Up 5 hours (healthy)
65c4e55c4c02 resourcemanager 8088/tcp Up 5 hours (healthy)
977b7ad4f4ef hbase 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp, 0.0.0.0:16000->16000/tcp, 0.0.0.0:16010->16010/tcp, 0.0.0.0:16020->16020/tcp, 0.0.0.0:16030->16030/tcp Up 5 hours
View the hbase container logs: docker logs -f hbase
Open a bash shell in the hbase container: docker exec -it hbase /bin/bash
On the server, add a host mapping with vi /etc/hosts:
<server-ip> hbase
Open the HBase admin page: http://<server-ip>:16010/master-status
2. Using the HBase shell
Open a bash shell in the hbase container: docker exec -it hbase /bin/bash
Start the HBase shell client: hbase shell
[root@hbase ~]# docker exec -it hbase /bin/bash
root@hbase:/# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0031 seconds
hbase(main):001:0>
Basic operations:
- Create a table
create <tableName>, <colFamily> [, <colFamily1>, ..., <colFamilyN>]
- Add data
put <tableName>, <rowKey>, <colFamily:col>, <value> [, <timestamp>]
- Query data
get <tableName>, <rowKey> [, <colFamily:col>]
- Query all data
scan <tableName>
hbase(main):002:0> create 'user', 'info'
Created table user
Took 2.6171 seconds
=> Hbase::Table - user
hbase(main):003:0> put 'user', '001', 'info:name', 'linyufeng01'
Took 0.2252 seconds
hbase(main):004:0> get 'user', '001'
COLUMN CELL
info:name timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0576 seconds
hbase(main):005:0> put 'user', '001', 'info:age', '12'
Took 0.0079 seconds
hbase(main):006:0> get 'user', '001'
COLUMN CELL
info:age timestamp=1623618951968, value=12
info:name timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0102 seconds
hbase(main):007:0> get 'user', '001', 'info:name'
COLUMN CELL
info:name timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0084 seconds
hbase(main):008:0> scan 'user'
ROW COLUMN+CELL
001 column=info:age, timestamp=1623618951968, value=12
001 column=info:name, timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0138 seconds
hbase(main):009:0>
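To make the data model behind put, get, and scan concrete, here is a toy in-memory sketch in Java. It is not the HBase client API: ToyTable and its methods are hypothetical names, and it assumes a single version per cell (real HBase also stores a timestamp with each cell, as the shell output above shows). It only illustrates that a table maps a row key to a sorted set of family:qualifier cells.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of an HBase table (hypothetical, not the HBase client API). */
final class ToyTable {
    // rowKey -> ("family:qualifier" -> value); both levels are kept sorted by key.
    private final TreeMap<String, Map<String, String>> rows = new TreeMap<>();

    /** Like: put 'user', '001', 'info:name', 'linyufeng01' (overwrites the cell). */
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    /** Like: get 'user', '001' (all cells of one row; empty if the row is absent). */
    Map<String, String> get(String rowKey) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? Collections.emptyMap() : Collections.unmodifiableMap(row);
    }

    /** Like: get 'user', '001', 'info:name' (one cell, or null if absent). */
    String get(String rowKey, String column) {
        return get(rowKey).get(column);
    }

    /** Like: scan 'user' (every row, in row-key order). */
    Map<String, Map<String, String>> scan() {
        return Collections.unmodifiableMap(rows);
    }

    public static void main(String[] args) {
        ToyTable user = new ToyTable();
        user.put("001", "info:name", "linyufeng01");
        user.put("001", "info:age", "12");
        System.out.println(user.get("001"));              // cells come back sorted by column name
        System.out.println(user.get("001", "info:name"));
        System.out.println(user.scan().size() + " row(s)");
    }
}
```

Because both levels are sorted maps, info:age is listed before info:name, which matches the order in the shell session.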
3. Using the Phoenix shell
Download: http://www.apache.org/dyn/closer.lua/phoenix/apache-phoenix-5.0.0-HBase-2.0/bin/apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz
Extract with tar -zxvf apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz, then change into the bin directory.
Start the Phoenix shell client: ./sqlline.py localhost
[root@hbase bin]# ./sqlline.py localhost
0: jdbc:phoenix:localhost> !tables
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYPE_NAME | SELF_REFERENCING_COL_NAME | REF_GENERATION | INDEX_STATE | IMMUTABLE_ROWS | SALT_BUCKETS | MULTI_TENANT | |
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
| | | IDX_T_LOG | INDEX | | | | | ACTIVE | false | 16 | false | |
| | | IDX_T_LOG_UID | INDEX | | | | | ACTIVE | false | 16 | false | |
| | SYSTEM | CATALOG | SYSTEM TABLE | | | | | | false | null | false | |
| | SYSTEM | FUNCTION | SYSTEM TABLE | | | | | | false | null | false | |
| | SYSTEM | LOG | SYSTEM TABLE | | | | | | true | 32 | false | |
| | SYSTEM | SEQUENCE | SYSTEM TABLE | | | | | | false | null | false | |
| | SYSTEM | STATS | SYSTEM TABLE | | | | | | false | null | false | |
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
0: jdbc:phoenix:localhost>
Basic SQL syntax
CREATE TABLE IF NOT EXISTS t_user (
    name VARCHAR(20) NOT NULL,
    age integer(3) NOT NULL,
    address VARCHAR,
    updatetime timestamp,
    inserttime timestamp
    CONSTRAINT pk_name_age PRIMARY KEY (name, age) -- composite primary key
);
-- Inserts and updates both use UPSERT; there is no INSERT or UPDATE
UPSERT INTO t_user (name, age, address) values ('test01',12,'上海市松江區');
UPSERT INTO t_user (name, age, address) values ('test02',13,'上海市金山區');
SELECT * FROM t_user;
SELECT COUNT(*) FROM t_user;
0: jdbc:phoenix:localhost> SELECT * FROM t_user;
+---------+------+----------+-------------+-------------+
| NAME | AGE | ADDRESS | UPDATETIME | INSERTTIME |
+---------+------+----------+-------------+-------------+
| test01 | 12 | 上海市松江區 | | |
| test02 | 13 | 上海市金山區 | | |
+---------+------+----------+-------------+-------------+
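From application code, the same UPSERT can be issued through plain JDBC. The sketch below is an illustration rather than part of the original setup: UpsertSql, build, and execute are hypothetical names, table and column names are assumed to come from trusted code (they are concatenated, not escaped), and the explicit commit() assumes Phoenix's default of auto-commit being off.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that builds and runs parameterized Phoenix UPSERT statements. */
final class UpsertSql {

    /** Builds "UPSERT INTO table (c1, c2) VALUES (?, ?)" for the given columns. */
    static String build(String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }

    /** Binds the values in order, executes the UPSERT, and commits. */
    static void execute(Connection conn, String table, List<String> columns, List<?> values)
            throws SQLException {
        if (columns.size() != values.size()) {
            throw new IllegalArgumentException("columns and values must have the same size");
        }
        try (PreparedStatement ps = conn.prepareStatement(build(table, columns))) {
            for (int i = 0; i < values.size(); i++) {
                ps.setObject(i + 1, values.get(i)); // JDBC parameters are 1-based
            }
            ps.executeUpdate();
        }
        conn.commit(); // assumption: auto-commit is off, so the write must be committed
    }

    public static void main(String[] args) {
        System.out.println(build("t_user", Arrays.asList("name", "age", "address")));
    }
}
```

Running the same statement twice with the same primary key (name, age) overwrites the row instead of failing, which is what makes UPSERT cover both the insert and the update case.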
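4. Database client tools for HBase and Phoenix
A graphical database client is indispensable, much like Navicat. SQuirreL SQL and DBeaver both work here.
DBeaver 6.1.5 is recommended; newer versions require JDK 11 to run.
1. Download and install
Download page: https://github.com/dbeaver/dbeaver/releases?after=6.2.0
Windows 64-bit installer: https://github.com/dbeaver/dbeaver/releases/download/6.1.5/dbeaver-ce-6.1.5-x86_64-setup.exe
If DBeaver fails to start after installation, find the dbeaver.ini file in the installation directory and add the local JDK path at the top of the file (adjust the path to your machine):
-vm
C:\Program Files\Java\jdk1.8.0_201\bin
Copy phoenix-5.0.0-HBase-2.0-client.jar to the root directory; the driver will be loaded from it later.
2. Configure the connection
Create a new connection
Configure the driver
Select the local jar as the driver; do not use the automatically downloaded Hadoop Phoenix driver
Test the connection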
5. Spring Boot integration
Use MyBatis if you like; MyBatis-Plus works as well.
Dependencies:
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
</dependency>
<dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-boot-starter</artifactId>
    <version>3.1.0</version>
</dependency>
<!--phoenix core-->
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>5.0.0-HBase-2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.9.2</version>
</dependency>
Configuration:
spring.datasource.driver-class-name=org.apache.phoenix.jdbc.PhoenixDriver
spring.datasource.url=jdbc:phoenix:<server-ip>:2181
spring.datasource.username=
spring.datasource.password=
According to the official FAQ, Phoenix JDBC connections do not need to be pooled: http://phoenix.apache.org/faq.html#Should_I_pool_Phoenix_JDBC_Connections
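Since pooling is unnecessary, a connection can simply be opened per unit of work and closed afterwards. Here is a minimal sketch with plain JDBC, assuming the Phoenix client jar is on the classpath. PhoenixJdbc, url, and countRows are hypothetical names, and the URL follows the jdbc:phoenix:<zookeeper-quorum>:<port> pattern used in the configuration above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for short-lived, unpooled Phoenix connections. */
final class PhoenixJdbc {

    /** Builds a URL such as jdbc:phoenix:hbase:2181 from the ZooKeeper quorum and port. */
    static String url(String zkQuorum, int zkPort) {
        if (zkQuorum == null || zkQuorum.trim().isEmpty()) {
            throw new IllegalArgumentException("zkQuorum is required");
        }
        if (zkPort < 1 || zkPort > 65535) {
            throw new IllegalArgumentException("invalid port: " + zkPort);
        }
        return "jdbc:phoenix:" + zkQuorum.trim() + ":" + zkPort;
    }

    /** Opens a connection, counts the rows of a trusted table name, and closes everything. */
    static long countRows(String url, String table) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table)) {
            rs.next(); // COUNT(*) always yields exactly one row
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) {
        // Only prints the URL; countRows needs a reachable HBase/Phoenix instance.
        System.out.println(url("hbase", 2181));
    }
}
```

In the Spring Boot setup the datasource handles this for you; the sketch is mainly useful for quick checks outside the application context.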
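Everything else is set up as in an ordinary project: create the beans, DAOs, services, and so on as you normally would.
The only difference is that Phoenix merges the INSERT and UPDATE statements into a single UPSERT statement. Do not write INSERT INTO any more; use UPSERT INTO instead.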
Bean
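import com.baomidou.mybatisplus.annotation.TableName;
import lombok.Data;
import lombok.ToString;

import java.util.Date;

@Data
@ToString
@TableName("t_user")
public class User {
    private String name;
    private Integer age;
    private String address;
    private Date inserttime;
    private Date updatetime;
}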
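DAO
import org.apache.ibatis.annotations.Insert;
import org.apache.ibatis.annotations.Mapper;
import org.apache.ibatis.annotations.Select;

import java.util.List;

@Mapper
public interface UserMapper {
    @Select("select * from t_user")
    List<User> list();

    // Phoenix has no INSERT; UPSERT covers both insert and update
    @Insert("UPSERT INTO t_user (name, age, address, inserttime, updatetime) " +
            "values (#{name}, #{age}, #{address}, #{inserttime}, #{updatetime})")
    void insert(User user);
}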
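Configure the Hadoop environment
Reference: https://blog.csdn.net/weixin_41122339/article/details/81141913
Alternatively, you can do without a Hadoop installation:
add a bin directory under the project root, copy winutils.exe into it, and modify the Spring Boot application class: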
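import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class HbasephoenixApplication {
    public static void main(String[] args) {
        // Point Hadoop at the project root (which contains bin\winutils.exe);
        // alternatively, set the HADOOP_HOME environment variable
        System.setProperty("hadoop.home.dir", System.getProperty("user.dir"));
        SpringApplication.run(HbasephoenixApplication.class, args);
    }
}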
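The setProperty call only helps if winutils.exe is actually where Hadoop looks for it, that is, under <hadoop.home.dir>\bin. A small guard can make a misconfiguration obvious at startup. This is a sketch under that assumption; HadoopHome and configure are hypothetical names, not part of Hadoop or Spring.

```java
import java.io.File;

/** Hypothetical startup guard for the winutils.exe workaround. */
final class HadoopHome {

    /**
     * Sets hadoop.home.dir to the given directory, but only if
     * <home>/bin/winutils.exe exists. Returns whether the property was set.
     */
    static boolean configure(File home) {
        File winutils = new File(new File(home, "bin"), "winutils.exe");
        if (!winutils.isFile()) {
            return false; // leave the property untouched so the problem is not hidden
        }
        System.setProperty("hadoop.home.dir", home.getAbsolutePath());
        return true;
    }

    public static void main(String[] args) {
        File projectRoot = new File(System.getProperty("user.dir"));
        if (!configure(projectRoot)) {
            System.err.println("winutils.exe not found under " + new File(projectRoot, "bin"));
        }
    }
}
```

Calling HadoopHome.configure(...) before SpringApplication.run(...) would replace the bare setProperty line; if it returns false, the null\bin\winutils.exe error listed under Bug below is the likely result on Windows.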
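Unit test
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

import java.util.Date;
import java.util.List;

// JUnit 4 runner: the test class and its test methods must be public
@SpringBootTest
@RunWith(SpringJUnit4ClassRunner.class)
public class HbasephoenixApplicationTests {
    @Autowired
    private UserMapper userMapper;

    @Test
    public void list() {
        List<User> users = userMapper.list();
        users.forEach(System.out::println);
    }

    @Test
    public void add() {
        User user = new User();
        user.setName("test03");
        user.setAge(14);
        user.setAddress("SHANGHAI SONGJIANG");
        user.setInserttime(new Date());
        user.setUpdatetime(new Date());
        userMapper.insert(user);
        list();
    }
}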
Bug
1. null\bin\winutils.exe
https://blog.csdn.net/weixin_41122339/article/details/81141913
2. certificate has expired or is not yet valid
https://blog.csdn.net/qq_39562468/article/details/84592139
3. spark-phoenix keeps hitting a bug?
http://www.hainiubl.com/topics/36024
References: