MySQL實現嵌套集合模型
譯文主要是介紹如何用MySQL來存儲嵌套集合數據。在其中會增加一些自己的理解,也會刪除掉一些自認為無用的廢話。
這篇文章主要講的是嵌套集合模型,所以鄰接表不是本文的重點,簡單略過就好。
也許這是原文地址,因為我也不知道這是不是原文。
介紹
什么是分層數據?
類似於樹形結構,除了根節點和葉子節點外,所有節點都有用一個父節點和多個子節點。
那么,在MySQL中如何處理分層數據呢?
原文中介紹了兩種分層結構模型:鄰接表模型
和嵌套集合模型
。
鄰接表模型(The Adjacency List Model)
首先,建立測試表,導入測試數據,
CREATE TABLE category(
category_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
parent INT DEFAULT NULL
);
INSERT INTO category VALUES
(1,'ELECTRONICS',NULL),
(2,'TELEVISIONS',1),
(3,'TUBE',2),
(4,'LCD',2),
(5,'PLASMA',2),
(6,'PORTABLE ELECTRONICS',1),
(7,'MP3 PLAYERS',6),
(8,'FLASH',7),
(9,'CD PLAYERS',6),
(10,'2 WAY RADIOS',6);
SELECT * FROM category ORDER BY category_id;
+-------------+----------------------+--------+
| category_id | name | parent |
+-------------+----------------------+--------+
| 1 | ELECTRONICS | NULL |
| 2 | TELEVISIONS | 1 |
| 3 | TUBE | 2 |
| 4 | LCD | 2 |
| 5 | PLASMA | 2 |
| 6 | PORTABLE ELECTRONICS | 1 |
| 7 | MP3 PLAYERS | 6 |
| 8 | FLASH | 7 |
| 9 | CD PLAYERS | 6 |
| 10 | 2 WAY RADIOS | 6 |
+-------------+----------------------+--------+
10 rows in set (0.00 sec)
在鄰接表中,所有的數據均擁有一個Parent字段,用來存儲它的父節點。當前節點為根節點的話,它的父節點則為NULL。
那么在遍歷的時候,可以使用遞歸來實現查詢整棵樹,從根節點開始,不斷尋找子節點(父節點->子節點->父節點->子節點)。
檢索分層路徑
一般需要獲取一個分層結構的路徑問題,那么
SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS';
+-------------+----------------------+--------------+-------+
| lev1 | lev2 | lev3 | lev4 |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS | TUBE | NULL |
| ELECTRONICS | TELEVISIONS | LCD | NULL |
| ELECTRONICS | TELEVISIONS | PLASMA | NULL |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS | NULL |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL |
+-------------+----------------------+--------------+-------+
6 rows in set (0.00 sec)
檢索葉子節點
SELECT t1.name FROM
category AS t1 LEFT JOIN category as t2
ON t1.category_id = t2.parent
WHERE t2.category_id IS NULL;
+--------------+
| name |
+--------------+
| TUBE |
| LCD |
| PLASMA |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+--------------+
檢索指定路徑
SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS' AND t4.name = 'FLASH';
+-------------+----------------------+-------------+-------+
| lev1 | lev2 | lev3 | lev4 |
+-------------+----------------------+-------------+-------+
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH |
+-------------+----------------------+-------------+-------+
1 row in set (0.01 sec)
鄰接表的缺點
在檢索路徑的過程中,除了本層外,每一層都會對應一個LEFT JOIN
,那么如果層數不定怎么辦?或者層數過多?
在刪除中間層的節點時,需要同時刪除該節點下的所有節點,否則會出現孤立節點。
嵌套集合模型Nested Set Model
原文中主要的目的是介紹嵌套集合模型,如下
通過集合的包含關系,嵌套結合模型可以表示分層結構,每一個分層可以用一個Set來表示(一個圈),父節點所在的圈包含所有子節點所在的圈。
為了用MySQL來表示集合關系,需要定義連個字段left
和right
(表示一個集合的范圍)。
CREATE TABLE nested_category (
category_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
lft INT NOT NULL,
rgt INT NOT NULL
);
INSERT INTO nested_category VALUES
(1,'ELECTRONICS',1,20),
(2,'TELEVISIONS',2,9),
(3,'TUBE',3,4),
(4,'LCD',5,6),
(5,'PLASMA',7,8),
(6,'PORTABLE ELECTRONICS',10,19),
(7,'MP3 PLAYERS',11,14),
(8,'FLASH',12,13),
(9,'CD PLAYERS',15,16),
(10,'2 WAY RADIOS',17,18);
SELECT * FROM nested_category ORDER BY category_id;
+-------------+----------------------+-----+-----+
| category_id | name | lft | rgt |
+-------------+----------------------+-----+-----+
| 1 | ELECTRONICS | 1 | 20 |
| 2 | TELEVISIONS | 2 | 9 |
| 3 | TUBE | 3 | 4 |
| 4 | LCD | 5 | 6 |
| 5 | PLASMA | 7 | 8 |
| 6 | PORTABLE ELECTRONICS | 10 | 19 |
| 7 | MP3 PLAYERS | 11 | 14 |
| 8 | FLASH | 12 | 13 |
| 9 | CD PLAYERS | 15 | 16 |
| 10 | 2 WAY RADIOS | 17 | 18 |
+-------------+----------------------+-----+-----+
由於left
和right
是MySQL的保留字,因此,字段名稱用lft和rgt代替。每一個集合都是從lft開始到rgt結束,也就是集合的兩個邊界。
在樹中也同樣適用,
當為樹狀結構編號時,我們從左到右,一次一層,賦值按照從左到右的順序遍歷其子節點,這種方法稱為先序遍歷算法
。
檢索分層路徑
由於子節點的lft值總在父節點的lft和rgt值之間,所以可以通過父節點連接到子節點上來檢索整棵樹。
SELECT node.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = 'ELECTRONICS'
ORDER BY node.lft;
+----------------------+
| name |
+----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+----------------------+</pre>
這個方法並不需要考慮層數,而且不需要考慮節點的rgt。
檢索所有葉子節點
由於每一個葉子節點的rgt=lft+1
,那么只需要這一個條件即可。
SELECT name
FROM nested_category
WHERE rgt = lft + 1;
+--------------+
| name |
+--------------+
| TUBE |
| LCD |
| PLASMA |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+--------------+
檢索節點路徑
不再需要多個join連接操作。
SELECT parent.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'FLASH'
ORDER BY node.lft;
+----------------------+
| name |
+----------------------+
| ELECTRONICS |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
+----------------------+
檢索節點深度
通過COUNT
和GROUP BY
函數來獲取父節點的個數。
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;
+----------------------+-------+
| name | depth |
+----------------------+-------+
| ELECTRONICS | 0 |
| TELEVISIONS | 1 |
| TUBE | 2 |
| LCD | 2 |
| PLASMA | 2 |
| PORTABLE ELECTRONICS | 1 |
| MP3 PLAYERS | 2 |
| FLASH | 3 |
| CD PLAYERS | 2 |
| 2 WAY RADIOS | 2 |
+----------------------+-------+
甚至可以得到分層的縮進結果,
SELECT CONCAT( REPEAT(' ', COUNT(parent.name) - 1), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;
+-----------------------+
| name |
+-----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+-----------------------+
檢索子樹的深度
考慮到檢索中需要自連接的node或parent,因此需要增加一個額外的連接來作為子查詢來限制子樹。
SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
nested_category AS parent,
nested_category AS sub_parent,
(
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'PORTABLE ELECTRONICS'
GROUP BY node.name
ORDER BY node.lft
)AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
AND sub_parent.name = sub_tree.name
GROUP BY node.name
ORDER BY node.lft;
+----------------------+-------+
| name | depth |
+----------------------+-------+
| PORTABLE ELECTRONICS | 0 |
| MP3 PLAYERS | 1 |
| FLASH | 2 |
| CD PLAYERS | 1 |
| 2 WAY RADIOS | 1 |
+----------------------+-------+
檢索節點的直接子節點
假設一個場景,當用戶點擊網站上電子產品的一個分類時,將呈現該分類下的產品,同時需要列出所有子分類,並不是全部分類。
為了限制顯示分類的層數,需要使用HAVING
字句,
SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
nested_category AS parent,
nested_category AS sub_parent,
(
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'PORTABLE ELECTRONICS'
GROUP BY node.name
ORDER BY node.lft
)AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft;
+----------------------+-------+
| name | depth |
+----------------------+-------+
| PORTABLE ELECTRONICS | 0 |
| MP3 PLAYERS | 1 |
| CD PLAYERS | 1 |
| 2 WAY RADIOS | 1 |
+----------------------+-------+
增加新節點
上面已經介紹了如何檢索結果,那么如何才能增加新的節點呢?
如果希望在TELEVISIONS和PROTABLE ELECTRONICS節點之間增加一個新的節點,那么新節點的lft和rgt的值應該是10和11,那么所有大於10的節點(新節點右側的節點)的lft和rgt都應該加2,如上圖所示。
LOCK TABLE nested_category WRITE;
SELECT @myRight := rgt FROM nested_category
WHERE name = 'TELEVISIONS';
UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight;
INSERT INTO nested_category(name, lft, rgt) VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2);
UNLOCK TABLES
如果希望在葉子節點下增加節點,需要修改下查詢語句,
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft FROM nested_category
WHERE name = '2 WAY RADIOS';
UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft;
INSERT INTO nested_category(name, lft, rgt) VALUES('FRS', @myLeft + 1, @myLeft + 2);
UNLOCK TABLES;```
###刪除節點
刪除葉子節點比較容易,只需要刪除自己,而刪除一個中間層節點就需要刪除其所有子節點。在這個模型中,所有子節點的節點正好在lft和rgt之間。
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'GAME CONSOLES';
DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;
UNLOCK TABLES;
在某些情況下,只需要刪除某個節點,但是並不希望刪除該節點下的子節點數據。
通過把右側所有節點的左右值-2,當前節點的子節點左右值-1
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'PORTABLE ELECTRONICS';
DELETE FROM nested_category WHERE lft = @myLeft;
UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight;
UNLOCK TABLES;
##最后的思考
原作者推薦了一本名為《Joe Celko's Trees and Hierarchies in SQL for Smarties》的書籍,該書的作者是SQL領域的大神Joe Celko(嵌套幾何模型的創造者)。這本書涵蓋了本文中未涉及到的一些高級話題。