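I. Introduction
Generally, as a business grows, the amount of data in the database keeps increasing. Once a single table holds more than ten million rows, some SQL queries start to hit performance problems. At first you can relieve the pressure on the database with master-slave replication and read/write splitting, but eventually the data has to be split horizontally and vertically by sharding it across databases and tables.
I know of two ways to implement sharding: the first uses the Mycat middleware, the second uses sharding-jdbc. By comparison, sharding-jdbc is more lightweight because you only need to add a jar to the project. I won't compare their pros and cons here; search for the relevant material if you are interested.
If the principles of sharding are unfamiliar, see the blog post "Database sharding: vertical? horizontal?" (數據庫之分庫分表-垂直?水平?).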
II. Sharding with Dangdang's sharding-jdbc
2.1 Create a Spring Boot project
Create a project named sharding-jdbc-first and add the following to the pom file:
- <parent>
- <groupId>org.springframework.boot</groupId>
- <artifactId>spring-boot-starter-parent</artifactId>
- <version>1.5.16.RELEASE</version>
- <relativePath/> <!-- lookup parent from repository -->
- </parent>
- <properties>
- <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
- <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
- <java.version>1.8</java.version>
- </properties>
- <dependencies>
- <dependency>
- <groupId>org.springframework.boot</groupId>
- <artifactId>spring-boot-starter-data-jpa</artifactId>
- </dependency>
- <dependency>
- <groupId>org.springframework.boot</groupId>
- <artifactId>spring-boot-starter-web</artifactId>
- </dependency>
- <dependency>
- <groupId>mysql</groupId>
- <artifactId>mysql-connector-java</artifactId>
- <scope>runtime</scope>
- </dependency>
- <dependency>
- <groupId>org.springframework.boot</groupId>
- <artifactId>spring-boot-starter-test</artifactId>
- <scope>test</scope>
- </dependency>
- <dependency>
- <groupId>com.dangdang</groupId>
- <artifactId>sharding-jdbc-core</artifactId>
- <version>1.4.2</version>
- </dependency>
- <dependency>
- <groupId>com.alibaba</groupId>
- <artifactId>druid</artifactId>
- <version>1.0.12</version>
- </dependency>
- <dependency>
- <groupId>com.dangdang</groupId>
- <artifactId>sharding-jdbc-self-id-generator</artifactId>
- <version>1.4.2</version>
- </dependency>
- </dependencies>
At the time of writing, this setup does not seem to support Spring Boot 2.0 or later.
2.2 Write the entity class and create the databases and tables
Goal:
ds_0
├── t_order_0 user_id even, order_id even
├── t_order_1 user_id even, order_id odd
ds_1
├── t_order_0 user_id odd, order_id even
├── t_order_1 user_id odd, order_id odd
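To make the target layout concrete, here is a minimal plain-Java sketch of the routing rule it describes. The class and method names are made up for illustration and are not part of sharding-jdbc; it also assumes IDs are non-negative.

```java
public class ShardRouteSketch {

    // Hypothetical helper: user_id picks the database, order_id picks the table.
    // Assumes both ids are non-negative, so % 2 is always 0 or 1.
    static String route(long userId, long orderId) {
        String database = "ds_" + (userId % 2);
        String table = "t_order_" + (orderId % 2);
        return database + "." + table;
    }

    public static void main(String[] args) {
        System.out.println(route(2, 4)); // ds_0.t_order_0
        System.out.println(route(2, 5)); // ds_0.t_order_1
        System.out.println(route(3, 4)); // ds_1.t_order_0
        System.out.println(route(3, 5)); // ds_1.t_order_1
    }
}
```

The two keys are independent: the database choice never looks at order_id, and the table choice never looks at user_id.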
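1. Create two databases, ds_0 and ds_1, with UTF-8 encoding.
2. Create two tables, t_order_0 and t_order_1, in each database. The SQL for t_order_0 is shown below; t_order_1 is the same apart from the table name: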
DROP TABLE IF EXISTS t_order_0;
CREATE TABLE t_order_0 (
order_id bigint(20) NOT NULL,
user_id bigint(20) NOT NULL,
PRIMARY KEY (order_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
3. Create the class Order with the following code:
- package cn.sp.bean;
- import javax.persistence.Entity;
- import javax.persistence.Id;
- import javax.persistence.Table;
- /**
- * Created by 2YSP on 2018/9/23.
- */
- @Entity
- @Table(name = "t_order")
- public class Order {
- @Id
- private Long orderId;
- private Long userId;
- public Long getOrderId() {
- return orderId;
- }
- public void setOrderId(Long orderId) {
- this.orderId = orderId;
- }
- public Long getUserId() {
- return userId;
- }
- public void setUserId(Long userId) {
- this.userId = userId;
- }
- }
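Take care to import the @Id annotation from the right package (javax.persistence.Id); I ran into this problem before.
4. Configuration file application.yml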
- server:
- port: 8000
- spring:
- jpa:
- database: mysql
- show-sql: true
- hibernate:
- ## tables are created manually
- ddl-auto: none
- application:
- name: sharding-jdbc-first
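Note that spring-data-jpa (through Hibernate) can create tables automatically depending on the ddl-auto setting. Here the tables are created by hand, so ddl-auto is set to none.
2.3 Custom sharding algorithms
1. The database sharding algorithm class must implement the SingleKeyDatabaseShardingAlgorithm<T> interface. It is a generic interface in which T is the type of the column used as the sharding key. For example, we choose the database by userId % 2 and userId is a Long, so T is Long here.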
- public class ModuloDatabaseShardingAlgorithm implements SingleKeyDatabaseShardingAlgorithm<Long> {
- public String doEqualSharding(Collection<String> availableDatabaseNames, ShardingValue<Long> shardingValue) {
- for(String databaseName : availableDatabaseNames){
- if (databaseName.endsWith(shardingValue.getValue() % 2 + "")){
- return databaseName;
- }
- }
- throw new IllegalArgumentException();
- }
- public Collection<String> doInSharding(Collection<String> availableDatabaseNames, ShardingValue<Long> shardingValue) {
- Collection<String> result = new LinkedHashSet<>(availableDatabaseNames.size());
- for(Long value : shardingValue.getValues()){
- for(String name : availableDatabaseNames){
- if (name.endsWith(value%2 + "")){
- result.add(name);
- }
- }
- }
- return result;
- }
- public Collection<String> doBetweenSharding(Collection<String> availableDatabaseNames, ShardingValue<Long> shardingValue) {
- Collection<String> result = new LinkedHashSet<>(availableDatabaseNames.size());
- Range<Long> range = shardingValue.getValueRange();
- // BETWEEN is inclusive, so the upper endpoint must be checked as well
- for (long i = range.lowerEndpoint(); i <= range.upperEndpoint(); i++) {
- for(String each : availableDatabaseNames){
- if (each.endsWith( i % 2+"")){
- result.add(each);
- }
- }
- }
- return result;
- }
- }
2. The table sharding algorithm class must implement the SingleKeyTableShardingAlgorithm<T> interface.
- /**
- * Table sharding algorithm
- * Created by 2YSP on 2018/9/23.
- */
- public class ModuloTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<Long> {
- /**
- * select * from t_order where order_id = 11
- * └── SELECT * FROM t_order_1 WHERE order_id = 11
- * select * from t_order where order_id = 44
- * └── SELECT * FROM t_order_0 WHERE order_id = 44
- */
- public String doEqualSharding(Collection<String> tableNames, ShardingValue<Long> shardingValue) {
- for (String tableName : tableNames) {
- if (tableName.endsWith(shardingValue.getValue() % 2 + "")) {
- return tableName;
- }
- }
- throw new IllegalArgumentException();
- }
- /**
- * select * from t_order where order_id in (11,44)
- * ├── SELECT * FROM t_order_0 WHERE order_id IN (11,44)
- * └── SELECT * FROM t_order_1 WHERE order_id IN (11,44)
- * select * from t_order where order_id in (11,13,15)
- * └── SELECT * FROM t_order_1 WHERE order_id IN (11,13,15)
- * select * from t_order where order_id in (22,24,26)
- * └── SELECT * FROM t_order_0 WHERE order_id IN (22,24,26)
- */
- public Collection<String> doInSharding(Collection<String> tableNames, ShardingValue<Long> shardingValue) {
- Collection<String> result = new LinkedHashSet<>(tableNames.size());
- for (Long value : shardingValue.getValues()) {
- for (String table : tableNames) {
- if (table.endsWith(value % 2 + "")) {
- result.add(table);
- }
- }
- }
- return result;
- }
- /**
- * select * from t_order where order_id between 10 and 20
- * ├── SELECT * FROM t_order_0 WHERE order_id BETWEEN 10 AND 20
- * └── SELECT * FROM t_order_1 WHERE order_id BETWEEN 10 AND 20
- */
- public Collection<String> doBetweenSharding(Collection<String> tableNames, ShardingValue<Long> shardingValue) {
- Collection<String> result = new LinkedHashSet<>(tableNames.size());
- Range<Long> range = shardingValue.getValueRange();
- // BETWEEN is inclusive, so the upper endpoint must be checked as well
- for (long i = range.lowerEndpoint(); i <= range.upperEndpoint(); i++) {
- for (String each : tableNames) {
- if (each.endsWith(i % 2 + "")) {
- result.add(each);
- }
- }
- }
- return result;
- }
- }
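The IN and BETWEEN routing in the two classes above can be tried out without sharding-jdbc on the classpath. The following is only a sketch in plain Java: the class and method names are hypothetical, the range is passed as two plain longs instead of a Guava Range, and the early exit in routeBetween is my own addition rather than something the library requires.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ModuloRoutingSketch {

    // IN (...): collect every table whose suffix matches value % 2.
    static Set<String> routeIn(List<String> tables, List<Long> values) {
        Set<String> result = new LinkedHashSet<>();
        for (long value : values) {
            for (String table : tables) {
                if (table.endsWith(String.valueOf(value % 2))) {
                    result.add(table);
                }
            }
        }
        return result;
    }

    // BETWEEN lower AND upper: both endpoints are inclusive.
    static Set<String> routeBetween(List<String> tables, long lower, long upper) {
        Set<String> result = new LinkedHashSet<>();
        for (long i = lower; i <= upper; i++) {
            for (String table : tables) {
                if (table.endsWith(String.valueOf(i % 2))) {
                    result.add(table);
                }
            }
            if (result.size() == tables.size()) {
                break; // every table is already selected, no need to scan the rest of the range
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("t_order_0", "t_order_1");
        System.out.println(routeIn(tables, Arrays.asList(11L, 13L, 15L))); // [t_order_1]
        System.out.println(routeBetween(tables, 10, 20));                  // [t_order_0, t_order_1]
        System.out.println(routeBetween(tables, 11, 12));                  // [t_order_1, t_order_0]
    }
}
```

The last call shows why the upper bound has to be inclusive: with an exclusive bound, BETWEEN 11 AND 12 would only be routed to t_order_1 and the row with order_id 12 would be missed.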
2.4 Configure the data sources
Data source configuration class DataSourceConfig:
- @Configuration
- public class DataSourceConfig {
- @Bean
- public IdGenerator getIdGenerator(){
- return new CommonSelfIdGenerator();
- }
- @Bean
- public DataSource getDataSource() {
- return buildDataSource();
- }
- private DataSource buildDataSource() {
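- // 1. Map data source names to the physical databases
- Map<String, DataSource> dataSourceMap = new HashMap<>(2);
- dataSourceMap.put("ds_0", createDataSource("ds_0"));
- dataSourceMap.put("ds_1", createDataSource("ds_1"));
- // Use ds_0 as the default database, i.e. the one used by tables that have no sharding strategy configured.
- // With a single database (no database sharding) the map holds just one entry and no default is needed,
- // but with two or more a default must be specified, otherwise tables without a strategy cannot be operated on.
- DataSourceRule rule = new DataSourceRule(dataSourceMap, "ds_0");
- // 2. Map the two actual tables t_order_0 and t_order_1 to the logical table t_order
- TableRule orderTableRule = TableRule.builder("t_order")
- .actualTables(Arrays.asList("t_order_0", "t_order_1"))
- .dataSourceRule(rule)
- .build();
- // 3. The concrete database and table sharding strategies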
- ShardingRule shardingRule = ShardingRule.builder()
- .dataSourceRule(rule)
- .tableRules(Arrays.asList(orderTableRule))
- .databaseShardingStrategy(new DatabaseShardingStrategy("user_id", new ModuloDatabaseShardingAlgorithm()))
- .tableShardingStrategy(new TableShardingStrategy("order_id", new ModuloTableShardingAlgorithm()))
- .build();
- DataSource dataSource = ShardingDataSourceFactory.createDataSource(shardingRule);
- return dataSource;
- }
- private static DataSource createDataSource(String dataSourceName) {
- // connect to the database through a Druid connection pool
- DruidDataSource druidDataSource = new DruidDataSource();
- druidDataSource.setDriverClassName("com.mysql.jdbc.Driver");
- druidDataSource.setUrl(String.format("jdbc:mysql://localhost:3306/%s?characterEncoding=utf-8", dataSourceName));
- druidDataSource.setUsername("root");
- druidDataSource.setPassword("1234");
- return druidDataSource;
- }
- }
Settings such as url, username and password could be improved by reading them from a configuration file instead of hard-coding them.
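One possible way to do that is sketched below with java.util.Properties. The property keys (db.url-template, db.username, db.password) and the class name are invented for this example; in a Spring Boot project you would more likely bind them with @Value or @ConfigurationProperties.

```java
import java.util.Objects;
import java.util.Properties;

public class DataSourceSettingsSketch {
    final String url;
    final String username;
    final String password;

    DataSourceSettingsSketch(String url, String username, String password) {
        this.url = url;
        this.username = username;
        this.password = password;
    }

    // Builds the settings for one data source from hypothetical property keys.
    // The URL template contains %s where the database name goes.
    static DataSourceSettingsSketch from(Properties props, String dataSourceName) {
        String template = Objects.requireNonNull(props.getProperty("db.url-template"), "db.url-template is missing");
        String username = Objects.requireNonNull(props.getProperty("db.username"), "db.username is missing");
        String password = Objects.requireNonNull(props.getProperty("db.password"), "db.password is missing");
        return new DataSourceSettingsSketch(String.format(template, dataSourceName), username, password);
    }

    public static void main(String[] args) {
        // In a real project these values would be loaded from a file with Properties.load(...)
        Properties props = new Properties();
        props.setProperty("db.url-template", "jdbc:mysql://localhost:3306/%s?characterEncoding=utf-8");
        props.setProperty("db.username", "root");
        props.setProperty("db.password", "1234");

        DataSourceSettingsSketch settings = from(props, "ds_0");
        System.out.println(settings.url); // jdbc:mysql://localhost:3306/ds_0?characterEncoding=utf-8
    }
}
```

createDataSource could then take such a settings object and pass its fields to setUrl, setUsername and setPassword.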
2.5 Testing
1. Create OrderRepository
- public interface OrderRepository extends CrudRepository<Order,Long> {
- }
2. Controller layer
- /**
- * Created by 2YSP on 2018/9/23.
- */
- @RestController
- @RequestMapping("/order")
- public class OrderController {
- @Autowired
- private OrderRepository repository;
- @Autowired
- private IdGenerator idGenerator;
- @RequestMapping("/add")
- public String add(){
- for(int i=0;i<10;i++){
- Order order = new Order();
- order.setOrderId((long) i);
- order.setUserId((long) i);
- repository.save(order);
- }
- // Order order = new Order();
- // order.setUserId(1L);
- // order.setOrderId(idGenerator.generateId().longValue());
- // repository.save(order);
- return "success";
- }
- @RequestMapping("/query")
- public List<Order> queryAll(){
- List<Order> orders = (List<Order>) repository.findAll();
- return orders;
- }
- }
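3. Visit http://localhost:8000/order/add (port 8000 is the one set in application.yml above); new rows then show up in the databases ds_0 and ds_1.
Visit http://localhost:8000/order/query to query the orders that were just added.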
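It helps to know in advance where the ten test rows should end up. The add() method uses the same value i for both order_id and user_id, so under the modulo rules above only two of the four tables receive data. The sketch below just replays that arithmetic in plain Java; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

public class AddEndpointDistributionSketch {

    // Counts how many of the rows with order_id = user_id = 0..count-1 land on each node.
    static Map<String, Integer> distribute(int count) {
        Map<String, Integer> rowsPerNode = new TreeMap<>();
        for (long i = 0; i < count; i++) {
            String node = "ds_" + (i % 2) + ".t_order_" + (i % 2);
            rowsPerNode.merge(node, 1, Integer::sum);
        }
        return rowsPerNode;
    }

    public static void main(String[] args) {
        System.out.println(distribute(10)); // {ds_0.t_order_0=5, ds_1.t_order_1=5}
    }
}
```

So ds_0.t_order_1 and ds_1.t_order_0 staying empty after this test is expected. To exercise all four tables, give user_id and order_id values with different parity.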
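Full code: https://github.com/2YSP/sharding-jdbc-first
III. Sharding with sharding-jdbc-spring-boot-starter
3.1 Add the dependency
Because my Spring Boot is version 2.x, I use the latest dependency. At the time of writing, the Maven repositories (including the Aliyun mirror) do not yet host the corresponding jar, so you need to download the source code from GitHub and run mvn clean install to install it into your local Maven repository.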
- <dependency>
- <groupId>io.shardingsphere</groupId>
- <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
- <version>3.0.0.M4</version>
- </dependency>
3.2 Spring Boot configuration
Add the following to application.properties:
########## Sharding configuration ##########
sharding.jdbc.datasource.names=ds0,ds1
## Alibaba's Druid connection pool is used here
sharding.jdbc.datasource.ds0.type=com.alibaba.druid.pool.DruidDataSource
sharding.jdbc.datasource.ds0.driver-class-name=com.mysql.jdbc.Driver
sharding.jdbc.datasource.ds0.url=jdbc:mysql://localhost:3306/ds_0
sharding.jdbc.datasource.ds0.username=root
sharding.jdbc.datasource.ds0.password=1234
sharding.jdbc.datasource.ds1.type=com.alibaba.druid.pool.DruidDataSource
sharding.jdbc.datasource.ds1.driver-class-name=com.mysql.jdbc.Driver
sharding.jdbc.datasource.ds1.url=jdbc:mysql://localhost:3306/ds_1
sharding.jdbc.datasource.ds1.username=root
sharding.jdbc.datasource.ds1.password=1234
## Default database sharding strategy: odd user_id --> data source ds1 (database ds_1), even user_id --> data source ds0 (database ds_0)
sharding.jdbc.config.sharding.default-database-strategy.inline.sharding-column=user_id
sharding.jdbc.config.sharding.default-database-strategy.inline.algorithm-expression=ds$->{user_id % 2}
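## t_order is the logical table. Each actual data node is written as data source name + table name, separated by a dot. Multiple nodes are separated by commas, and inline expressions are supported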
sharding.jdbc.config.sharding.tables.t_order.actual-data-nodes=ds$->{0..1}.t_order_$->{0..1}
## Inline (row) expression sharding strategy
sharding.jdbc.config.sharding.tables.t_order.table-strategy.inline.sharding-column=order_id
sharding.jdbc.config.sharding.tables.t_order.table-strategy.inline.algorithm-expression=t_order_$->{order_id % 2}
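To see which data nodes the actual-data-nodes expression stands for, here is a toy expander written in plain Java. It only understands numeric ranges of the form $->{a..b} and is my own illustration (the real starter evaluates inline expressions as Groovy, which this sketch does not attempt to reproduce).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineExpressionSketch {

    // Matches one numeric range segment such as $->{0..1}
    private static final Pattern RANGE = Pattern.compile("\\$->\\{(\\d+)\\.\\.(\\d+)\\}");

    // Expands every range segment, producing the Cartesian product of all ranges.
    static List<String> expand(String expression) {
        Matcher matcher = RANGE.matcher(expression);
        if (!matcher.find()) {
            return Collections.singletonList(expression); // nothing left to expand
        }
        int from = Integer.parseInt(matcher.group(1));
        int to = Integer.parseInt(matcher.group(2));
        List<String> result = new ArrayList<>();
        for (int i = from; i <= to; i++) {
            // Substitute the first range with one value, then expand the remainder recursively
            String replaced = expression.substring(0, matcher.start()) + i + expression.substring(matcher.end());
            result.addAll(expand(replaced));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand("ds$->{0..1}.t_order_$->{0..1}"));
        // [ds0.t_order_0, ds0.t_order_1, ds1.t_order_0, ds1.t_order_1]
    }
}
```

The four results are the same four physical tables that section 2 wired up by hand in DataSourceConfig. The prefixes differ only because ds0 and ds1 here are the data source names from sharding.jdbc.datasource.names, while ds_0 and ds_1 are the database names in the JDBC URLs.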
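Java configuration or YAML configuration can be used instead of the properties file. If you are interested, visit the project's GitHub page, which has Chinese documentation.
IV. Summary
When sharding, decide based on the actual situation which column to shard by (it does not have to be the primary key) and how many databases and tables are needed.
Problems encountered after sharding:
1. Database auto-increment primary keys can no longer be used as before, because each physical table increments on its own and duplicate keys appear (a distributed primary key can be used instead).
2. Some SQL keywords are not supported; which ones depends on the sharding-jdbc version.
3. Statistical queries also become harder; at that point a search engine such as Elasticsearch (ES) may be needed.
4. I used to think sharding-jdbc did not support pagination, but when I tested it the other day it actually worked.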
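As an illustration of the distributed primary key mentioned in point 1, below is a minimal Snowflake-style generator in plain Java. This is not the code of CommonSelfIdGenerator; the bit layout (timestamp, 10-bit worker id, 12-bit sequence), the epoch and all names are assumptions chosen for the sketch.

```java
public class SnowflakeIdSketch {
    private static final long EPOCH = 1514764800000L; // 2018-01-01T00:00:00Z, an arbitrary custom epoch
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be between 0 and " + MAX_WORKER_ID);
        }
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: spin until the clock advances
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy wait
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        // Layout: [timestamp delta][worker id][sequence]
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeIdSketch generator = new SnowflakeIdSketch(1);
        System.out.println(generator.nextId());
        System.out.println(generator.nextId());
    }
}
```

One thing to watch with this layout: the sequence restarts at 0 every millisecond, so under light load most ids are even. Combined with an order_id % 2 table rule, that would push most rows into t_order_0, which is worth checking before using such ids as a sharding key.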