The love-hate lock inside the socket

Reposted article. 2020-08-27 22:32. Tags: linux, tcp/ip, programming, c

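While chasing a problem I looked at softirq time and system CPU usage, and the numbers hurt.

The perf output makes it obvious at a glance: most of the time goes to a lock.

So what is this lock for?

A: A TCP socket has two kinds of users: processes (threads) and softirqs. At any moment two processes (threads), two softirqs running on different CPUs, or a process (thread) and a softirq may try to access the same socket. So how is mutual exclusion enforced, so that only one user touches the socket at a time? With a lock: the per-socket lock sk_lock, taken through lock_sock().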

struct sock {
...
    socket_lock_t        sk_lock;
...
};
/* This is the per-socket lock.  The spinlock provides a synchronization
 * between user contexts and software interrupt processing, whereas the
 * mini-semaphore synchronizes multiple users amongst themselves.
 */
typedef struct {
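    spinlock_t        slock;// the spinlock that synchronizes process context with softirq context
    int            owned;// 1: the sock is locked by a process context; 0: it is not
    wait_queue_head_t    wq;// wait queue: a process that needs the sock while another process context owns it sleeps here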
    /*
     * We express the mutex-alike socket_lock semantics
     * to the lock validator by explicitly managing
     * the slock as a lock variant (in addition to
     * the slock itself):
     */
#ifdef CONFIG_DEBUG_LOCK_ALLOC
    struct lockdep_map dep_map;
#endif
} socket_lock_t;

Access from process context

Before touching the sock, process context must lock it with lock_sock(); once it is done, it calls release_sock() to give it up.

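    //__lock_sock() puts the calling process on the wait queue sk->sk_lock.wq and returns
    //only once no other process owns the sock. Note: the caller already holds
    //sk->sk_lock.slock; it is dropped before sleeping and taken again before returning
    static void __lock_sock(struct sock *sk)
    {
        //define a wait-queue entry
        DEFINE_WAIT(wait);

        //loop until sock_owned_by_user() returns 0
        for (;;) {
            //put the calling process on the lock's wait queue
            prepare_to_wait_exclusive(&sk->sk_lock.wq, &wait,
                        TASK_UNINTERRUPTIBLE);
            //release the spinlock and re-enable bottom halves
            spin_unlock_bh(&sk->sk_lock.slock);
            //give up the CPU
            schedule();
            //execution resumes here once rescheduled: take the spinlock and disable bottom halves again
            spin_lock_bh(&sk->sk_lock.slock);
            //if no process owns the sock any more, stop waiting
            if (!sock_owned_by_user(sk))
                break;
        }
        finish_wait(&sk->sk_lock.wq, &wait);
    }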

void lock_sock_nested(struct sock *sk, int subclass)
{
    might_sleep();//note: calling lock_sock() may sleep
    spin_lock_bh(&sk->sk_lock.slock);//take the spinlock and disable bottom halves
    //owned != 0 means another process owns the sock: call __lock_sock() and sleep on the wait queue
    if (sk->sk_lock.owned)
        __lock_sock(sk);
    //when __lock_sock() returns, the state has been restored: spinlock held, bottom halves disabled.
    //set owned to 1: this process now owns the sock
    sk->sk_lock.owned = 1;
    //release the spinlock but leave bottom halves disabled for the moment
    spin_unlock(&sk->sk_lock.slock);
    /*
     * The sk_lock has mutex_lock() semantics here:
     */
    //lockdep annotation: tells the lock validator that sk_lock is now held with
    //mutex-like semantics (see the dep_map comment in socket_lock_t); it is a
    //no-op unless lock debugging is configured
    mutex_acquire(&sk->sk_lock.dep_map, subclass, 0, _RET_IP_);
    local_bh_enable();//re-enable bottom halves (softirqs)
}

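Once owned is 1, the process no longer holds the spinlock and softirqs are enabled again. The reason is that protocol-stack processing does not finish immediately. Simply holding the spinlock with bottom halves disabled from start to finish would hurt system performance, and keeping softirqs off for that long could also keep the NIC receive softirq from running in time, which leads to dropped packets.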
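To make the "short spinlock, long ownership" idea concrete, here is a small user-space sketch. It is only an analogy, not kernel code: a pthread mutex stands in for slock, a condition variable stands in for the wait queue wq, and every name starting with `usock_` is made up for this example. There is no softirq side and no backlog in it.

```c
#include <assert.h>
#include <pthread.h>

#define USOCK_THREADS 4
#define USOCK_ITERS   10000

/* Hypothetical user-space model of socket_lock_t plus some protected state. */
struct usock {
    pthread_mutex_t slock;   /* stands in for the kernel spinlock */
    int owned;               /* 1 while one thread owns the socket */
    pthread_cond_t wq;       /* stands in for the wait queue */
    long counter;            /* state protected by ownership, not by slock */
};

/* Like lock_sock(): hold slock only long enough to claim ownership. */
static void usock_lock(struct usock *sk)
{
    pthread_mutex_lock(&sk->slock);
    while (sk->owned)                        /* the __lock_sock() wait loop */
        pthread_cond_wait(&sk->wq, &sk->slock);
    sk->owned = 1;
    pthread_mutex_unlock(&sk->slock);
}

/* Like release_sock(), minus the backlog: clear ownership, wake one waiter. */
static void usock_release(struct usock *sk)
{
    pthread_mutex_lock(&sk->slock);
    assert(sk->owned == 1);
    sk->owned = 0;
    pthread_cond_signal(&sk->wq);
    pthread_mutex_unlock(&sk->slock);
}

static void *usock_worker(void *arg)
{
    struct usock *sk = arg;

    for (int i = 0; i < USOCK_ITERS; i++) {
        usock_lock(sk);
        sk->counter++;       /* the "long" work runs with slock released */
        usock_release(sk);
    }
    return NULL;
}

/* Start the workers, wait for them, and return the final counter. */
static long usock_run(void)
{
    struct usock sk;
    pthread_t tids[USOCK_THREADS];
    int started = 0;

    pthread_mutex_init(&sk.slock, NULL);
    pthread_cond_init(&sk.wq, NULL);
    sk.owned = 0;
    sk.counter = 0;

    for (int i = 0; i < USOCK_THREADS; i++) {
        if (pthread_create(&tids[i], NULL, usock_worker, &sk) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&sk.wq);
    pthread_mutex_destroy(&sk.slock);
    return sk.counter;
}
```

Called from a main() and built with `-pthread`, usock_run() should return USOCK_THREADS * USOCK_ITERS as long as all threads start: the counter is only ever touched by the current owner, even though the mutex is released while the increment happens.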

release_sock()

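When process context has finished working on the sock, it must call release_sock() to give it up. The core of the release is setting owned to 0 and notifying any other processes waiting for the sock.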

void release_sock(struct sock *sk)
{
    /*
     * The sk_lock has mutex_unlock() semantics:
     */
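    //lockdep annotation (debugging only), can be ignored
    mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);

    //take the spinlock and disable bottom halves
    spin_lock_bh(&sk->sk_lock.slock);
    //if the backlog queue is not empty, call __release_sock() to process the packets queued there (see the packet receive path)
    if (sk->sk_backlog.tail)
        __release_sock(sk);
    //set owned to 0: the caller no longer owns the sock
    sk->sk_lock.owned = 0;
    //if the wait queue is not empty, wake up the waiting processes
    if (waitqueue_active(&sk->sk_lock.wq))
        wake_up(&sk->sk_lock.wq);
    //release the spinlock and re-enable bottom halves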
    spin_unlock_bh(&sk->sk_lock.slock);
}
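The backlog drain in release_sock() can be pictured with a user-space sketch like the one below. It assumes a head/tail singly linked FIFO and a "detach the whole list, then process it" loop, which is loosely modeled on what __release_sock() does; the struct and function names (`bl_skb`, `bl_add`, `bl_drain`, `bl_demo`) are hypothetical, and there is no real locking here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: just a sequence number. */
struct bl_skb {
    int seq;
    struct bl_skb *next;
};

/* Head/tail singly linked FIFO, similar in spirit to sk->sk_backlog. */
struct bl_queue {
    struct bl_skb *head;
    struct bl_skb *tail;
};

/* Append at the tail, as the softirq side would while the socket is owned. */
static void bl_add(struct bl_queue *q, struct bl_skb *skb)
{
    skb->next = NULL;
    if (q->tail)
        q->tail->next = skb;
    else
        q->head = skb;
    q->tail = skb;
}

/*
 * Drain the queue in arrival order. The list is detached first; in a kernel
 * setting that is the point where the spinlock could be dropped while the
 * packets are handled, and the outer loop would pick up anything queued in
 * the meantime. Returns the number of sequence numbers written to out[]
 * (entries beyond max are silently skipped in this sketch).
 */
static int bl_drain(struct bl_queue *q, int *out, int max)
{
    int n = 0;
    struct bl_skb *skb;

    while ((skb = q->head) != NULL) {
        q->head = q->tail = NULL;            /* detach the whole list */
        while (skb != NULL) {
            struct bl_skb *next = skb->next;

            skb->next = NULL;
            if (n < max)
                out[n++] = skb->seq;         /* "process" the packet */
            skb = next;
        }
    }
    return n;
}

/* Queue three packets, then drain them the way release_sock() would. */
static int bl_demo(int *out, int max)
{
    struct bl_queue q = { NULL, NULL };
    struct bl_skb pkts[3] = { { 10, NULL }, { 20, NULL }, { 30, NULL } };

    for (int i = 0; i < 3; i++)
        bl_add(&q, &pkts[i]);
    return bl_drain(&q, out, max);
}
```

The point of the sketch is ordering: packets parked while the socket was owned are handled first-in, first-out when the owner lets go, so the process ends up paying for the receive work the softirq deferred.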

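(1) The softirq gets there first, then the process

The softirq already holds the spinlock, so the process spins when it tries to take it and succeeds only after the softirq releases it.

(2) The process gets there first, then the softirq

The process takes the spinlock (with softirqs disabled, so a softirq cannot interrupt it), sets sk->sk_lock.owned to 1, releases the spinlock, re-enables softirqs, and then goes on to work on the socket. If a softirq fires at this point, it interrupts the process, and the softirq handler puts the incoming data on the receive backlog queue.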

int tcp_v4_rcv(struct sk_buff *skb)
{
...
process:
...
    //take the sk->sk_lock.slock spinlock
    bh_lock_sock_nested(sk);
    //if no process owns the sock, receive the data into the prequeue or the receive_queue
    if (!sock_owned_by_user(sk)) {
        if (!tcp_prequeue(sk, skb))
            ret = tcp_v4_do_rcv(sk, skb);
    } else
        //a process owns the sock: park the data on the backlog queue and get out quickly; the process handles the backlog when it calls release_sock()
        sk_add_backlog(sk, skb);
    //release the spinlock
    bh_unlock_sock(sk);
...

/* BH context may only use the following locking interface. */
#define bh_lock_sock(__sk)    spin_lock(&((__sk)->sk_lock.slock))
#define bh_lock_sock_nested(__sk) \
                spin_lock_nested(&((__sk)->sk_lock.slock), \
                SINGLE_DEPTH_NESTING)
#define bh_unlock_sock(__sk)    spin_unlock(&((__sk)->sk_lock.slock))
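The two sides can be put together in a tiny single-threaded simulation. This is only a sketch of the decision logic described above, with counters instead of real packets and no spinlock at all; the `sim_` names are invented for the example and the prequeue path is left out.

```c
#include <assert.h>

/* Hypothetical model of the state the softirq and process sides share. */
struct sim_sock {
    int owned;        /* 1 while a process context owns the socket */
    int processed;    /* packets fully handled so far */
    int backlog_len;  /* packets parked while owned == 1 */
};

enum sim_path { SIM_DIRECT, SIM_BACKLOG };

/* Softirq side, like the branch in tcp_v4_rcv(): handle now or park it. */
static enum sim_path sim_rcv(struct sim_sock *sk)
{
    if (!sk->owned) {
        sk->processed++;      /* nobody owns the socket: process directly */
        return SIM_DIRECT;
    }
    sk->backlog_len++;        /* owned by a process: defer to the backlog */
    return SIM_BACKLOG;
}

/* Process side, like lock_sock() when the socket is free. */
static void sim_lock_sock(struct sim_sock *sk)
{
    assert(!sk->owned);
    sk->owned = 1;
}

/* Process side, like release_sock(): drain the backlog, then drop ownership. */
static void sim_release_sock(struct sim_sock *sk)
{
    assert(sk->owned);
    sk->processed += sk->backlog_len;
    sk->backlog_len = 0;
    sk->owned = 0;
}
```

Walking through it: one sim_rcv() on an unowned socket is handled directly; after sim_lock_sock(), two more calls land on the backlog; sim_release_sock() then accounts for both, so all three packets end up processed and the backlog is empty again.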

So this lock does not look avoidable. How should it be dealt with, then?
