Linux Source Code Walkthrough (14): Red-Black Trees in the Kernel, Principles and API Analysis

Reposted article. 2022-01-13 21:42. Category: Operating System Principles

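1. The red-black tree is a very important data structure with two notable properties:

Insertion, deletion, and lookup all run in O(log N) time in the worst case, where N is the number of nodes. That is far faster than a linked list, and it makes the red-black tree a binary tree with very stable performance.
An in-order traversal visits the keys in ascending order.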

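Given these two properties, red-black trees fit the following scenarios well:

Scenarios that need dynamic insertion, deletion, and lookup, including but not limited to:
- Create, read, update, and delete operations in some databases, for example conditional queries such as select * from xxx where ...
- Process scheduling in the Linux kernel: the CFS scheduler organizes runnable tasks in a red-black tree (through the scheduling entity embedded in each task_struct), which makes inserting, removing, and finding tasks fast
- Memory management in Linux: allocated memory regions are organized in a red-black tree, so that when memory is released the target region can be located quickly from its address. For example, the kernel has historically kept each process's virtual memory areas (vm_area_struct) in a red-black tree
- Insert, delete, update, and lookup of (key, value) pairs in a hashmap: since Java 8, HashMap converts a bucket's collision chain from a linked list to a red-black tree once the chain grows too long
- The Ext3 file system, which organizes directory entries with a red-black tree
Scenarios that need sorted data, for example:
- Linux timers: hrtimer keeps its timers in a red-black tree, and the leftmost node of the tree is the timer that expires soonest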

These scenarios show how widely the red-black tree is used. Next we look more closely at how the Linux kernel itself uses it.

2. First, the definition of a red-black tree node, in include/linux/rbtree.h:

struct rb_node {
    unsigned long  __rb_parent_color;
    struct rb_node *rb_right;
    struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));
    /* The alignment might seem pointless, but allegedly CRIS needs it */

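The struct is very simple, with only three fields. Anyone with a little development experience will wonder: red-black trees serve so many scenarios, yet this struct carries no business field for any of them. It looks like an unfurnished shell of a house, so how is it used? This is exactly the elegant part of the design. The kernel uses red-black trees in many places. If rb_node carried fields for one particular scenario, it could only be used there and would be tightly coupled to that scenario, and a growing number of such structs would make the code harder to maintain. So rb_node is kept minimal and holds only what a tree node itself needs: the parent pointer and the node color (packed together in __rb_parent_color), the right child, and the left child. list_head follows the same idea.

How do you use a struct this bare? A simple example first makes the kernel source easier to follow. Suppose a class has 50 students, each with an id, a name, and a score, and we want to organize all of them in a red-black tree. Start by defining a student struct: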

struct Student{
    int id;
    char *name;
    int score;
    struct rb_node s_rb;
};

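The first three are business fields and the fourth is the tree node. (Student and rb_node look like two separate structs, but the compiler lays out the embedded rb_node inline, so a Student is one contiguous block of memory, somewhat like inheritance in C++.) Linux provides the basic red-black tree operations: insert, delete, replace, search helpers, left and right rotation, setting the color, and so on. Note that the helpers below come from an older kernel version in which the field was named rb_parent_color rather than __rb_parent_color: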

#define rb_parent(r)   ((struct rb_node *)((r)->rb_parent_color & ~3)) // clear the low two bits
#define rb_color(r)   ((r)->rb_parent_color & 1)                       // take the lowest bit
#define rb_is_red(r)   (!rb_color(r))                                  // lowest bit is 0?
#define rb_is_black(r) rb_color(r)                                     // lowest bit is 1?
#define rb_set_red(r)  do { (r)->rb_parent_color &= ~1; } while (0)    // clear the lowest bit
#define rb_set_black(r)  do { (r)->rb_parent_color |= 1; } while (0)   // set the lowest bit

static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p) // set the parent
{
    rb->rb_parent_color = (rb->rb_parent_color & 3) | (unsigned long)p;
}
static inline void rb_set_color(struct rb_node *rb, int color)          // set the color
{
    rb->rb_parent_color = (rb->rb_parent_color & ~1) | color;
}
// left and right rotation
void __rb_rotate_left(struct rb_node *node, struct rb_root *root);
void __rb_rotate_right(struct rb_node *node, struct rb_root *root);
// delete a node
void rb_erase(struct rb_node *, struct rb_root *);
void __rb_erase_color(struct rb_node *node, struct rb_node *parent, struct rb_root *root);
// replace a node
void rb_replace_node(struct rb_node *old, struct rb_node *new, struct rb_root *tree);
// insert a node: link it into its slot, then rebalance and recolor
void rb_link_node(struct rb_node * node, struct rb_node * parent, struct rb_node ** rb_link);
void rb_insert_color(struct rb_node *, struct rb_root *);

// traverse the tree
extern struct rb_node *rb_next(const struct rb_node *); // successor
extern struct rb_node *rb_prev(const struct rb_node *); // predecessor
extern struct rb_node *rb_first(const struct rb_root *);// minimum
extern struct rb_node *rb_last(const struct rb_root *); // maximum
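The rb_parent and rb_color helpers above work because a node is at least 4-byte aligned, so the two lowest bits of any node address are always zero and can hold the color. The following is a minimal user-space sketch of that packing, not kernel code; the `demo_` names and the `DEMO_RED`/`DEMO_BLACK` constants are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RED   0u
#define DEMO_BLACK 1u

/* Hypothetical user-space stand-in for struct rb_node */
struct demo_node {
    uintptr_t parent_color;          /* parent address | color bit */
    struct demo_node *right;
    struct demo_node *left;
};

/* Pack a parent pointer and a color into one word */
void demo_set_parent_color(struct demo_node *n,
                           struct demo_node *parent, unsigned color)
{
    /* The trick only works if the two low address bits are free */
    assert(((uintptr_t)parent & 3u) == 0);
    n->parent_color = (uintptr_t)parent | color;
}

/* Mask off the low two bits to recover the parent pointer */
struct demo_node *demo_parent(const struct demo_node *n)
{
    return (struct demo_node *)(n->parent_color & ~(uintptr_t)3);
}

/* The lowest bit is the color */
unsigned demo_color(const struct demo_node *n)
{
    return (unsigned)(n->parent_color & 1u);
}
```

Storing the color this way saves a separate field per node, at the cost of having to mask the word every time the parent is read.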

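All the interfaces above take rb_node parameters. How can they operate on a tree built from a user-defined struct, such as the Student struct above? Since we do not change the parameters or the implementations, we have to pass an rb_node as required, so how do the struct's own fields come along into the tree? It is simple: create your own struct instance and pass the address of its embedded rb_node member, as follows: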

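/*
 Add an object to the red-black tree
 s_root          root of the red-black tree
 ptr_stu         pointer to the object
 rb_link         address of the child link (slot) where the object's node is attached
 rb_parent       parent node
 */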
void student_link_rb(struct rb_root *s_root, struct Student *ptr_stu,
        struct rb_node **rb_link, struct rb_node *rb_parent)
{
    rb_link_node(&ptr_stu->s_rb, rb_parent, rb_link);
    rb_insert_color(&ptr_stu->s_rb, s_root);
}

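void add_student(struct rb_root *s_root, struct Student *stu, struct Student **stu_header)
{
    struct rb_node **rb_link = &s_root->rb_node, *rb_parent = NULL;

    // Walk down to the empty slot where stu belongs, ordered by id
    while (*rb_link) {
        struct Student *cur = container_of(*rb_link, struct Student, s_rb);
        rb_parent = *rb_link;
        rb_link = (stu->id < cur->id) ? &rb_parent->rb_left : &rb_parent->rb_right;
    }
    // Link the node into that slot, then rebalance
    student_link_rb(s_root, stu, rb_link, rb_parent);
}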

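Suppose the tree is keyed by id, the field that find_by_id later compares. Each Student node's rb_right and rb_left pointers point to the start of a child's rb_node, that is, to the address of its __rb_parent_color field. But the fields the business logic actually needs to read and write are score, name, and id. How do we get to them?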

The Linux developers worked this out long ago: first obtain the start address of the Student instance, then read the fields at their offsets. The macro is:

#define container_of(ptr, type, member) ({                \
    const typeof( ((type *)0)->member ) *__mptr = (ptr);  \
    (type *)( (char *)__mptr - offsetof(type,member) );})

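This macro yields the start address of the Student instance. To use it, call container_of with three arguments: the rb_node pointer (which locates the instance), the Student struct type, and the name of the rb_node member inside it (used to compute the member's offset within the struct, from which the instance's start address is derived). With the start address in hand, the id, name, and score fields can be used directly: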

struct Student* find_by_id(struct rb_root *root, int id)
{
    struct Student *ptr_stu = NULL;
    struct rb_node *rbnode = root->rb_node;
    while (NULL != rbnode)
    {
        // The key line: the three arguments are the rb_node pointer, the Student struct type, and the name of the embedded rb_node member
        struct Student *ptr_tmp = container_of(rbnode, struct Student, s_rb);
        if (id < ptr_tmp->id)
        {
            rbnode = rbnode->rb_left;
        }
        else if (id > ptr_tmp->id)
        {
            rbnode = rbnode->rb_right;
        }
        else
        {
            ptr_stu = ptr_tmp;
            break;
        }
    }
    return ptr_stu;
}
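The pointer arithmetic behind container_of can be tried in user space. The sketch below is an illustration under assumptions, not the kernel macro: `demo_container_of` drops the typeof type check so it compiles with any C compiler, and `demo_link` and `demo_student` are hypothetical stand-ins for rb_node and Student.

```c
#include <assert.h>
#include <stddef.h>

/* Portable variant of container_of, without the typeof type check */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct rb_node */
struct demo_link {
    struct demo_link *left, *right;
};

struct demo_student {
    int id;
    const char *name;
    int score;
    struct demo_link s_rb;           /* embedded node, as in struct Student */
};

/* Given only the embedded node, recover the enclosing record's score */
int demo_score_of(struct demo_link *node)
{
    struct demo_student *stu =
        demo_container_of(node, struct demo_student, s_rb);
    return stu->score;
}
```

Subtracting `offsetof(struct demo_student, s_rb)` from the member's address lands exactly on the first byte of the enclosing struct, which is why tree code that only ever sees node pointers can still reach every business field.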

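To summarize, the rough workflow for using a red-black tree is:

Define a struct with the fields the business scenario needs, and make sure it embeds an rb_node.
Create an instance stu of the struct, then call rb_link_node followed by rb_insert_color to add it to the tree. The argument passed is, of course, &stu->s_rb.
When traversing or searching, pass the s_rb pointer, the struct type, and the member name to container_of to get the start address of the struct instance, then read and write the business fields freely.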
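The three steps can be sketched end to end in user space. This is a simplified illustration under assumptions: the `demo_` types and functions are hypothetical, and the rebalancing step that rb_insert_color performs in the kernel is left out, so the result is a plain intrusive binary search tree rather than a real red-black tree.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node { struct demo_node *parent, *left, *right; };
struct demo_root { struct demo_node *node; };

#define demo_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Step 1: a business struct that embeds the tree node */
struct demo_student {
    int id;
    int score;
    struct demo_node s_rb;
};

/* Same job as rb_link_node: hang node in the empty slot *link under parent */
void demo_link_node(struct demo_node *node, struct demo_node *parent,
                    struct demo_node **link)
{
    node->parent = parent;
    node->left = node->right = NULL;
    *link = node;
}

/* Step 2: find the slot by key, then link (no rebalancing in this sketch) */
void demo_add(struct demo_root *root, struct demo_student *stu)
{
    struct demo_node **link = &root->node, *parent = NULL;

    while (*link) {
        struct demo_student *cur =
            demo_entry(*link, struct demo_student, s_rb);
        parent = *link;
        link = (stu->id < cur->id) ? &parent->left : &parent->right;
    }
    demo_link_node(&stu->s_rb, parent, link);
}

/* Step 3: walk nodes, mapping each back to its record to compare keys */
struct demo_student *demo_find(struct demo_root *root, int id)
{
    struct demo_node *n = root->node;

    while (n) {
        struct demo_student *cur = demo_entry(n, struct demo_student, s_rb);
        if (id == cur->id)
            return cur;
        n = (id < cur->id) ? n->left : n->right;
    }
    return NULL;
}
```

The tree code never learns what a student is. Only the two call sites that compare keys know the enclosing type, which is what lets one node type serve many unrelated structs.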

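3. The example above is simple enough, and the many complex uses of red-black trees inside Linux work on exactly the same principle, with no essential difference. Once you understand this example, you understand how the kernel uses red-black trees. Next, let's look at how some key APIs are implemented: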

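(1) A red-black tree is sorted: an in-order traversal yields the keys in ascending order, and the leftmost node is the smallest in the whole tree. So walking left until there is no left child finds the first, smallest node: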

/*
 * This function returns the first node (in sort order) of the tree.
 */
struct rb_node *rb_first(const struct rb_root *root)
{
    struct rb_node    *n;

    n = root->rb_node;
    if (!n)
        return NULL;
    while (n->rb_left)
        n = n->rb_left;
    return n;
}

Likewise, walking right all the way finds the largest node in the tree:

struct rb_node *rb_last(const struct rb_root *root)
{
    struct rb_node    *n;

    n = root->rb_node;
    if (!n)
        return NULL;
    while (n->rb_right)
        n = n->rb_right;
    return n;
}

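(2) Finding the node that follows a given node. Suppose node A holds 50. If A has a right child, start there (every node in the right subtree is larger than A) and keep going left until the left child is NULL; that node is the smallest node larger than A. If A has no right child, climb toward the root until you step up out of a left subtree; the parent reached at that point is the successor. This is useful for conditional (range) queries.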

struct rb_node *rb_next(const struct rb_node *node)
{
    struct rb_node *parent;

    if (RB_EMPTY_NODE(node))
        return NULL;

    /*
     * If we have a right-hand child, go down and then left as far
     * as we can.
     */
    if (node->rb_right) {
        node = node->rb_right;
        while (node->rb_left)
            node=node->rb_left;
        return (struct rb_node *)node;
    }

    /*
     * No right-hand children. Everything down and left is smaller than us,
     * so any 'next' node must be in the general direction of our parent.
     * Go up the tree; any time the ancestor is a right-hand child of its
     * parent, keep going up. First time it's a left-hand child of its
     * parent, said parent is our 'next' node.
     */
    while ((parent = rb_parent(node)) && node == parent->rb_right)
        node = parent;

    return parent;
}

Likewise, to find the largest node in the tree that is smaller than A:

struct rb_node *rb_prev(const struct rb_node *node)
{
    struct rb_node *parent;

    if (RB_EMPTY_NODE(node))
        return NULL;

    /*
     * If we have a left-hand child, go down and then right as far
     * as we can.
     */
    if (node->rb_left) {
        node = node->rb_left;
        while (node->rb_right)
            node=node->rb_right;
        return (struct rb_node *)node;
    }

    /*
     * No left-hand children. Go up till we find an ancestor which
     * is a right-hand child of its parent.
     */
    while ((parent = rb_parent(node)) && node == parent->rb_left)
        node = parent;

    return parent;
}
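Together, the first and next walks give an ordered scan, which is how a conditional query over a key range can be answered. The sketch below is a user-space illustration with hypothetical `demo_` names; the key is stored in the node only to keep the example short, and the scan starts at the minimum rather than seeking to the lower bound first.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
    struct demo_node *parent, *left, *right;
    int key;                     /* kept inline only to keep the sketch short */
};

/* Leftmost node is the minimum */
struct demo_node *demo_first(struct demo_node *root)
{
    if (!root)
        return NULL;
    while (root->left)
        root = root->left;
    return root;
}

/* In-order successor, following the same two cases as rb_next */
struct demo_node *demo_next(struct demo_node *n)
{
    struct demo_node *parent;

    if (n->right) {              /* leftmost node of the right subtree */
        n = n->right;
        while (n->left)
            n = n->left;
        return n;
    }
    /* climb until we step up out of a left subtree */
    while ((parent = n->parent) && n == parent->right)
        n = parent;
    return parent;
}

/* Copy up to max keys in [lo, hi] into out, ascending; return the count */
int demo_range(struct demo_node *root, int lo, int hi, int *out, int max)
{
    int count = 0;
    struct demo_node *n;

    for (n = demo_first(root); n && n->key <= hi; n = demo_next(n))
        if (n->key >= lo && count < max)
            out[count++] = n->key;
    return count;
}
```

The loop stops as soon as a key exceeds the upper bound, because every later node in the in-order sequence is larger still.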

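(3) Replacing a node: redirect the surrounding pointers to the new node, then copy the victim's child pointers, parent, and color into it: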

void rb_replace_node(struct rb_node *victim, struct rb_node *new,
             struct rb_root *root)
{
    struct rb_node *parent = rb_parent(victim);

    /* Set the surrounding nodes to point to the replacement */
    __rb_change_child(victim, new, parent, root);
    if (victim->rb_left)
        rb_set_parent(victim->rb_left, new);
    if (victim->rb_right)
        rb_set_parent(victim->rb_right, new);

    /* Copy the pointers/colour from the victim to the replacement */
    *new = *victim;
}
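The same three moves can be reproduced on a plain node with explicit parent pointers. This is a hedged user-space sketch with hypothetical `demo_` names; it has no color field, so the final struct copy only carries the links, whereas the kernel version copies the packed parent and color word as well.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node { struct demo_node *parent, *left, *right; };
struct demo_root { struct demo_node *node; };

/* Swap victim for repl in place; the caller guarantees equal keys */
void demo_replace(struct demo_node *victim, struct demo_node *repl,
                  struct demo_root *root)
{
    struct demo_node *parent = victim->parent;

    /* 1. Redirect the link that pointed at victim */
    if (!parent)
        root->node = repl;
    else if (parent->left == victim)
        parent->left = repl;
    else
        parent->right = repl;

    /* 2. Re-parent victim's children */
    if (victim->left)
        victim->left->parent = repl;
    if (victim->right)
        victim->right->parent = repl;

    /* 3. Copy victim's own links into the replacement */
    *repl = *victim;
}
```

No rebalancing is needed because the shape of the tree does not change. That also means the replacement must sort in the same position as the victim, or the ordering invariant is silently broken.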

(4) Inserting a node: recolor, then rotate left or right depending on the case:

static __always_inline void
__rb_insert(struct rb_node *node, struct rb_root *root,
        void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
{
    struct rb_node *parent = rb_red_parent(node), *gparent, *tmp;

    while (true) {
        /*
         * Loop invariant: node is red
         *
         * If there is a black parent, we are done.
         * Otherwise, take some corrective action as we don't
         * want a red root or two consecutive red nodes.
         */
        if (!parent) {
            rb_set_parent_color(node, NULL, RB_BLACK);
            break;
        } else if (rb_is_black(parent))
            break;

        gparent = rb_red_parent(parent);

        tmp = gparent->rb_right;
        if (parent != tmp) {    /* parent == gparent->rb_left */
            if (tmp && rb_is_red(tmp)) {
                /*
                 * Case 1 - color flips
                 *
                 *       G            g
                 *      / \          / \
                 *     p   u  -->   P   U
                 *    /            /
                 *   n            n
                 *
                 * However, since g's parent might be red, and
                 * 4) does not allow this, we need to recurse
                 * at g.
                 */
                rb_set_parent_color(tmp, gparent, RB_BLACK);
                rb_set_parent_color(parent, gparent, RB_BLACK);
                node = gparent;
                parent = rb_parent(node);
                rb_set_parent_color(node, parent, RB_RED);
                continue;
            }

            tmp = parent->rb_right;
            if (node == tmp) {
                /*
                 * Case 2 - left rotate at parent
                 *
                 *      G             G
                 *     / \           / \
                 *    p   U  -->    n   U
                 *     \           /
                 *      n         p
                 *
                 * This still leaves us in violation of 4), the
                 * continuation into Case 3 will fix that.
                 */
                tmp = node->rb_left;
                WRITE_ONCE(parent->rb_right, tmp);
                WRITE_ONCE(node->rb_left, parent);
                if (tmp)
                    rb_set_parent_color(tmp, parent,
                                RB_BLACK);
                rb_set_parent_color(parent, node, RB_RED);
                augment_rotate(parent, node);
                parent = node;
                tmp = node->rb_right;
            }

            /*
             * Case 3 - right rotate at gparent
             *
             *        G           P
             *       / \         / \
             *      p   U  -->  n   g
             *     /                 \
             *    n                   U
             */
            WRITE_ONCE(gparent->rb_left, tmp); /* == parent->rb_right */
            WRITE_ONCE(parent->rb_right, gparent);
            if (tmp)
                rb_set_parent_color(tmp, gparent, RB_BLACK);
            __rb_rotate_set_parents(gparent, parent, root, RB_RED);
            augment_rotate(gparent, parent);
            break;
        } else {
            tmp = gparent->rb_left;
            if (tmp && rb_is_red(tmp)) {
                /* Case 1 - color flips */
                rb_set_parent_color(tmp, gparent, RB_BLACK);
                rb_set_parent_color(parent, gparent, RB_BLACK);
                node = gparent;
                parent = rb_parent(node);
                rb_set_parent_color(node, parent, RB_RED);
                continue;
            }

            tmp = parent->rb_left;
            if (node == tmp) {
                /* Case 2 - right rotate at parent */
                tmp = node->rb_right;
                WRITE_ONCE(parent->rb_left, tmp);
                WRITE_ONCE(node->rb_right, parent);
                if (tmp)
                    rb_set_parent_color(tmp, parent,
                                RB_BLACK);
                rb_set_parent_color(parent, node, RB_RED);
                augment_rotate(parent, node);
                parent = node;
                tmp = node->rb_left;
            }

            /* Case 3 - left rotate at gparent */
            WRITE_ONCE(gparent->rb_right, tmp); /* == parent->rb_left */
            WRITE_ONCE(parent->rb_left, gparent);
            if (tmp)
                rb_set_parent_color(tmp, gparent, RB_BLACK);
            __rb_rotate_set_parents(gparent, parent, root, RB_RED);
            augment_rotate(gparent, parent);
            break;
        }
    }
}
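The rotations in Case 2 and Case 3 are written inline in the kernel code, which makes them hard to see. As a standalone illustration, here is a left rotation on a plain node with explicit parent pointers; the `demo_` names are hypothetical and colors are omitted, so this shows only the pointer surgery, not the recoloring.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node { struct demo_node *parent, *left, *right; int key; };
struct demo_root { struct demo_node *node; };

/*
 * Left rotate at x: x's right child y moves up, x becomes y's left child,
 * and y's old left subtree becomes x's right subtree.
 */
void demo_rotate_left(struct demo_node *x, struct demo_root *root)
{
    struct demo_node *y = x->right;
    struct demo_node *parent = x->parent;

    assert(y != NULL);           /* a left rotation needs a right child */

    /* y's left subtree sits between x and y in key order */
    x->right = y->left;
    if (y->left)
        y->left->parent = x;

    y->left = x;
    x->parent = y;

    /* hook y into the place x used to occupy */
    y->parent = parent;
    if (!parent)
        root->node = y;
    else if (parent->left == x)
        parent->left = y;
    else
        parent->right = y;
}
```

A rotation changes which node is on top but preserves the in-order sequence, which is why the insertion fix-up can rotate freely without breaking the sort order.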

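The best part of rb_node is that it carries no business fields and is loosely coupled to any scenario, so the struct and its functions work across different scenarios. Combined with the container_of macro, the address of an rb_node quickly leads back to the start of the enclosing business struct, which makes its fields easy to read and write. It is a very clever design.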

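4. Why are red-black trees so effective? In my view the core is the dynamic height adjustment. In other words, during inserts and deletes the tree rebalances itself to avoid degenerating into a singly linked list, keeping its height at most 2lg(n+1). Compared with an AVL tree, a red-black tree enforces a looser balance condition (it only has to keep black heights equal), so updates typically need fewer rotations. As a result, insert, delete, and lookup all stay at O(lg n). How does the tree control its height? Through the five red-black rules: (1) every node is red or black; (2) the root is black; (3) every leaf (NULL) is black; (4) both children of a red node are black; (5) every path from a node down to its leaves contains the same number of black nodes. Rules 4 and 5 matter most.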
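The 2lg(n+1) height bound follows from rules 4 and 5. A sketch of the standard textbook argument, where h is the height of the tree, n the number of internal nodes, and bh(x) a helper notation introduced here for the black height of node x:

```latex
% bh(x): number of black nodes on any path from x down to a leaf, not counting x.
% Claim (by induction on height): the subtree rooted at x contains at least
% 2^{bh(x)} - 1 internal nodes.
\begin{align*}
  \text{rule 4 (no red node has a red child):}\quad
      & bh(\mathrm{root}) \ge \tfrac{h}{2} \\
  \text{rule 5 and the claim:}\quad
      & n \ge 2^{bh(\mathrm{root})} - 1 \ge 2^{h/2} - 1 \\
  \Rightarrow\quad
      & h \le 2\log_2(n+1)
\end{align*}
```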

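(1) Rule 4 first: no two adjacent nodes may both be red, so red nodes are always separated by black nodes. Pick any path from the root to a leaf. Because of this rule and the black root, every red node on the path is paired with at least one black node, so the number of red nodes <= the number of black nodes. If the path contains b black nodes, it contains at most 2b nodes in total.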

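(2) Now rule 5: for every node, all paths from that node to its reachable leaves contain the same number of black nodes. A newly inserted node starts out red. If its parent is also red, the colors of its ancestors have to be fixed up one level at a time. If the recoloring breaks rule 5, the tree is restructured by rotation, which in effect lowers the height of the tree and keeps it from degenerating into a list. For example, start from an initial red-black tree (the figure is not reproduced in this text):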

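Insert node 8. It starts out red and becomes the right child of node 7. Because node 7 is also red, it has to be recolored black. But then the path 2->4->6->7 contains 3 black nodes, so we continue upward through nodes 6, 4, and 2 and recolor each of them. That turns the root 2 red and leaves 5 and 6 both red, and neither situation is allowed. A left rotation then lifts node 4 to the root and lowers the height of the tree, so later inserts, deletes, and lookups keep their O(lg n) time complexity.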
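Rules 4 and 5 are easy to check mechanically, which helps when tracing an example like this one by hand. The following is a small user-space checker written for illustration; the `demo_` names are hypothetical and the node keeps an explicit color field instead of the kernel's packed word.

```c
#include <assert.h>
#include <stddef.h>

enum demo_color { DEMO_RED, DEMO_BLACK };

struct demo_node {
    struct demo_node *left, *right;
    enum demo_color color;
};

/*
 * Return the black height of the subtree (a NULL leaf counts as one black
 * node), or -1 if rule 4 or rule 5 is violated somewhere below.
 */
int demo_black_height(const struct demo_node *n)
{
    int lh, rh;

    if (!n)
        return 1;

    /* Rule 4: a red node must not have a red child */
    if (n->color == DEMO_RED &&
        ((n->left && n->left->color == DEMO_RED) ||
         (n->right && n->right->color == DEMO_RED)))
        return -1;

    lh = demo_black_height(n->left);
    rh = demo_black_height(n->right);

    /* Rule 5: both sides must carry the same number of black nodes */
    if (lh < 0 || rh < 0 || lh != rh)
        return -1;

    return lh + (n->color == DEMO_BLACK ? 1 : 0);
}

/* Rule 2 (black root) plus rules 4 and 5 for the whole tree */
int demo_is_valid(const struct demo_node *root)
{
    return (!root || root->color == DEMO_BLACK) &&
           demo_black_height(root) > 0;
}
```

Running such a checker after each insert in a test harness is a cheap way to confirm that a fix-up routine restores the invariants.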

