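The idea of splitting a table is fairly easy to grasp. Take this site's comment table as an example: because it holds a large amount of data, a single table may struggle to keep up with CRUD operations on comments. This also leads to many other concepts related to read/write speed, such as database read/write splitting, NoSQL, and so on.
Vertical splitting:
As the name suggests, the table is split vertically, that is, by columns. (The table below has some fields omitted.)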
- +--------+---------+--------+--------+-------+---------+---------+--------+-----+-------------+--------+-----------+------+--------+
- | userid | groupid | areaid | amount | point | modelid | message | islock | vip | overduedate | siteid | connectid | from | mobile |
- +--------+---------+--------+--------+-------+---------+---------+--------+-----+-------------+--------+-----------+------+--------+
- | 1 | 5 | 0 | 0.00 | 50 | 10 | 0 | 0 | 1 | 0 | 1 | | | |
- +--------+---------+--------+--------+-------+---------+---------+--------+-----+-------------+--------+-----------+------+--------+
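For example, a user table has many attributes and is linked to a lot of data. Keeping everything in one table makes queries convenient, but efficiency suffers, so this is where vertical splitting comes in.
It is split as follows: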
- +--------+---------+--------+--------+-------+---------+---------+
- | userid | groupid | areaid | amount | point | modelid | message |
- +--------+---------+--------+--------+-------+---------+---------+
- | 1 | 5 | 0 | 0.00 | 50 | 10 | 0 |
- +--------+---------+--------+--------+-------+---------+---------+
- and
- +--------+--------+-----+-------------+--------+-----------+------+--------+
- | userid | islock | vip | overduedate | siteid | connectid | from | mobile |
- +--------+--------+-----+-------------+--------+-----------+------+--------+
- | 1 | 0 | 1 | 0 | 1 | | | |
- +--------+--------+-----+-------------+--------+-----------+------+--------+
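- Put frequently used fields in one table and rarely used fields in another
- Move large fields, such as text fields, out into a table of their own
- In practice, split according to the specific business; query with multi-table joins, and optionally combine this with Redis as a cache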
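The split above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names user_base and user_ext and the helper functions are hypothetical, and the column types are guesses based on the sample row.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Frequently used fields (hypothetical table name).
CREATE TABLE user_base (
    userid  INTEGER PRIMARY KEY,
    groupid INTEGER, areaid INTEGER, amount REAL,
    point   INTEGER, modelid INTEGER, message INTEGER
);
-- Rarely used fields (hypothetical table name); "from" is quoted
-- because it is a reserved word in SQL.
CREATE TABLE user_ext (
    userid INTEGER PRIMARY KEY,
    islock INTEGER, vip INTEGER, overduedate INTEGER,
    siteid INTEGER, connectid TEXT, "from" TEXT, mobile TEXT
);
INSERT INTO user_base VALUES (1, 5, 0, 0.00, 50, 10, 0);
INSERT INTO user_ext  VALUES (1, 0, 1, 0, 1, '', '', '');
""")


def load_hot_fields(userid):
    """Common case: read only the frequently used table, no join needed."""
    return conn.execute(
        "SELECT userid, groupid, point FROM user_base WHERE userid = ?",
        (userid,),
    ).fetchone()


def load_full_user(userid):
    """Rare case: join both halves back together on userid."""
    return conn.execute(
        """
        SELECT b.userid, b.point, e.vip, e.mobile
        FROM user_base AS b
        JOIN user_ext AS e ON e.userid = b.userid
        WHERE b.userid = ?
        """,
        (userid,),
    ).fetchone()
```

Most requests would call load_hot_fields and touch only the narrow table; the join is needed only when the rarely used fields are actually required.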
Horizontal splitting:
As the name suggests, the table's data is split horizontally, that is, by rows:
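Of course, you do not have to use exactly 10 tables numbered 0-9. Rows are usually assigned to tables by taking a modulo: with 4 tables, id % 4 gives one of four results, 0, 1, 2, or 3, so user_0, user_1, user_2, and user_3 are enough. The right number depends on how much data the table holds.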
- Table 0: user_0
- +--------+---------+--------+--------+-------+---------+---------+
- | userid | groupid | areaid | amount | point | modelid | message |
- +--------+---------+--------+--------+-------+---------+---------+
- | 1 | 5 | 0 | 0.00 | 50 | 10 | 0 |
- +--------+---------+--------+--------+-------+---------+---------+
- Table 1: user_1
- +--------+---------+--------+--------+-------+---------+---------+
- | userid | groupid | areaid | amount | point | modelid | message |
- +--------+---------+--------+--------+-------+---------+---------+
- | 1 | 5 | 0 | 0.00 | 50 | 10 | 0 |
- +--------+---------+--------+--------+-------+---------+---------+
- Table 2: user_2
- +--------+---------+--------+--------+-------+---------+---------+
- | userid | groupid | areaid | amount | point | modelid | message |
- +--------+---------+--------+--------+-------+---------+---------+
- | 1 | 5 | 0 | 0.00 | 50 | 10 | 0 |
- +--------+---------+--------+--------+-------+---------+---------+
- .
- .
- .
- Table 9: user_9
- +--------+---------+--------+--------+-------+---------+---------+
- | userid | groupid | areaid | amount | point | modelid | message |
- +--------+---------+--------+--------+-------+---------+---------+
- | 1 | 5 | 0 | 0.00 | 50 | 10 | 0 |
- +--------+---------+--------+--------+-------+---------+---------+
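Picking the table by modulo can be sketched in a few lines of Python. The function names and the default of 4 tables are assumptions made for illustration; only the user_N naming comes from the text.

```python
SHARD_COUNT = 4  # assumed number of tables; the diagrams above use 10


def shard_table(userid, shard_count=SHARD_COUNT):
    """Return the table name that holds the given id, using id % shard_count."""
    return f"user_{userid % shard_count}"


def select_sql(userid, shard_count=SHARD_COUNT):
    """Build a parameterized SELECT against the table chosen by modulo."""
    return f"SELECT * FROM {shard_table(userid, shard_count)} WHERE userid = ?"
```

For example, shard_table(7) picks user_3 with 4 tables, while shard_table(19, 10) picks user_9 with 10 tables. Note that changing the table count changes where existing ids map, so the count is best chosen up front.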
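CRUD operations on horizontally split data work the same way: first take the id modulo the number of tables to work out which table the row lives in, then run select * from user_"the remainder". Some will ask: before inserting a row I do not know its id, so I cannot take the modulo; how do I find the right table to insert into? This is where a "temporary table" comes in (in practice, a small dedicated table used only to generate ids). Its job is to supply the auto-increment id for the insert; once you have that id, take the modulo and insert into the matching table.
The horizontally split tables all share the same structure; only the auto-increment attribute is removed.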
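The insert flow described here (get an id from the id table, then route by modulo) might look like the following sketch using Python's sqlite3 module. The id table name user_id_seq, the reduced column set, and the helper names are all assumptions for illustration.

```python
import sqlite3

SHARD_COUNT = 4  # assumed number of tables
conn = sqlite3.connect(":memory:")

# Hypothetical id table: its only job is to hand out auto-increment ids.
conn.execute("CREATE TABLE user_id_seq (id INTEGER PRIMARY KEY AUTOINCREMENT)")

# The split tables share one structure; userid is always supplied by the
# caller, never generated by the split table itself.
for i in range(SHARD_COUNT):
    conn.execute(
        f"CREATE TABLE user_{i} "
        "(userid INTEGER PRIMARY KEY, groupid INTEGER, point INTEGER)"
    )


def next_user_id():
    """Insert an empty row into the id table and return its new id."""
    cur = conn.execute("INSERT INTO user_id_seq DEFAULT VALUES")
    return cur.lastrowid


def insert_user(groupid, point):
    """Obtain an id first, then insert into the table chosen by id % SHARD_COUNT."""
    userid = next_user_id()
    table = f"user_{userid % SHARD_COUNT}"
    conn.execute(
        f"INSERT INTO {table} (userid, groupid, point) VALUES (?, ?, ?)",
        (userid, groupid, point),
    )
    return userid, table


def find_user(userid):
    """Read a row back by recomputing the same modulo."""
    table = f"user_{userid % SHARD_COUNT}"
    return conn.execute(
        f"SELECT userid, groupid, point FROM {table} WHERE userid = ?",
        (userid,),
    ).fetchone()
```

Because reads and writes both compute the table from the id alone, no lookup table is needed to find a row later.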
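There is another form of horizontal splitting worth mentioning, one that does not use a modulo. Instead, user_0 stores 100,000 rows; once it is full, a new table user_1 is created and writes continue there; once that is full, user_2 is created, and so on. In my view this scheme suits cases where the older tables hold historical data whose access demand gradually declines, while the newest table receives most of the traffic.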
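As a rough sketch of this range-based scheme, the table can be computed from the id, assuming ids start at 1 and are assigned sequentially without gaps. The function name is hypothetical; the 100,000-row capacity comes from the text.

```python
ROWS_PER_TABLE = 100_000  # capacity of each table, as in the text


def range_table(userid, rows_per_table=ROWS_PER_TABLE):
    """Map a sequential id (starting at 1) to the table that stores it."""
    if userid < 1:
        raise ValueError("userid must be >= 1")
    # ids 1..100000 go to user_0, 100001..200000 to user_1, and so on.
    return f"user_{(userid - 1) // rows_per_table}"
```

Unlike the modulo scheme, adding a new table here never moves existing rows; the trade-off is that the newest table takes nearly all of the writes.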
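What I want to say here is: "Split tables according to business needs; any architecture that does not serve the business is just fooling around."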
Original blog post: http://www.yigangwu.com/index.php?m=content&c=index&a=show&catid=33&id=59
