The previous article gave us a fairly concrete picture of wait queues. This article analyzes how a wait queue puts a process to sleep and how it wakes one up.
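Before using a wait queue you usually define a wait queue head: static wait_queue_head_t wq (initialized with init_waitqueue_head(), or declared and initialized in one step with DECLARE_WAIT_QUEUE_HEAD). You then call one of the wait_event_* functions, which inserts the current process, waiting on some condition, into the wait queue wq and puts it to sleep. Once condition becomes true, the kernel wakes up one or all of the processes sleeping on wq.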
Let us first analyze the sleep path, taking the commonly used wait_event_interruptible as the example:
/**
 * wait_event_interruptible - sleep until a condition gets true
 * @wq: the waitqueue to wait on
 * @condition: a C expression for the event to wait for
 *
 * The process is put to sleep (TASK_INTERRUPTIBLE) until the
 * @condition evaluates to true or a signal is received.
 * The @condition is checked each time the waitqueue @wq is woken up.
 *
 * wake_up() has to be called after changing any variable that could
 * change the result of the wait condition.
 *
 * The function will return -ERESTARTSYS if it was interrupted by a
 * signal and 0 if @condition evaluated to true.
 */
#define wait_event_interruptible(wq, condition)                 \
({                                                              \
    int __ret = 0;                                              \
    if (!(condition))                                           \
        __wait_event_interruptible(wq, condition, __ret);       \
    __ret;                                                      \
})
This part is simple: it checks whether condition is already true, and if not, it calls __wait_event_interruptible.
#define __wait_event_interruptible(wq, condition, ret)          \
do {                                                            \
    DEFINE_WAIT(__wait);                                        \
                                                                \
    for (;;) {                                                  \
        prepare_to_wait(&wq, &__wait, TASK_INTERRUPTIBLE);      \
        if (condition)                                          \
            break;                                              \
        if (!signal_pending(current)) {                         \
            schedule();                                         \
            continue;                                           \
        }                                                       \
        ret = -ERESTARTSYS;                                     \
        break;                                                  \
    }                                                           \
    finish_wait(&wq, &__wait);                                  \
} while (0)
__wait_event_interruptible first defines a wait queue entry named __wait, of type wait_queue_t:
#define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)

#define DEFINE_WAIT_FUNC(name, function)                        \
    wait_queue_t name = {                                       \
        .private   = current,                                   \
        .func      = function,                                  \
        .task_list = LIST_HEAD_INIT((name).task_list),          \
    }
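As you can see, the private member of __wait (normally used to hold the process descriptor) is initialized to current, which ties this wait queue entry to the current process. The func member is the entry's wake-up function. The waker invokes it when it wakes this entry, and here it is initialized to the default autoremove_wake_function.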
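To make the initializer easier to picture, here is a minimal user-space sketch of the same pattern. It is not kernel code: every model_ name is a hypothetical stand-in (model_current for current, model_autoremove_wake for autoremove_wake_function), and the wake callback is reduced to a placeholder that only reports success.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for kernel types; all model_ names are hypothetical. */
struct model_list_head {
    struct model_list_head *next, *prev;
};

struct model_task {
    const char *name;
};

struct model_wait_entry;
typedef int (*model_wake_fn)(struct model_wait_entry *entry);

struct model_wait_entry {
    unsigned int flags;
    void *private;                    /* the task that owns this entry */
    model_wake_fn func;               /* callback the waker invokes */
    struct model_list_head task_list; /* link into a wait queue */
};

/* Stand-in for the kernel's `current` task pointer. */
static struct model_task model_current_task = { "demo" };
#define model_current (&model_current_task)

/* Placeholder for autoremove_wake_function; it only reports success. */
static int model_autoremove_wake(struct model_wait_entry *entry)
{
    (void)entry;
    return 1;
}

/* Same shape as DEFINE_WAIT_FUNC: designated initializers plus a
 * list node that points at itself, meaning "not on any queue yet". */
#define MODEL_DEFINE_WAIT(name)                                  \
    struct model_wait_entry name = {                             \
        .private   = model_current,                              \
        .func      = model_autoremove_wake,                      \
        .task_list = { &(name).task_list, &(name).task_list },   \
    }

/* An entry whose node points at itself is not queued. */
static int model_entry_unqueued(const struct model_wait_entry *e)
{
    return e->task_list.next == &e->task_list;
}
```

Writing MODEL_DEFINE_WAIT(w); inside a function yields an entry that is bound to the stand-in current task, carries the default callback, has flags set to 0, and is not linked into any queue.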
Then, inside a for (;;) loop, it calls prepare_to_wait:
void prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state)
{
    unsigned long flags;

    wait->flags &= ~WQ_FLAG_EXCLUSIVE;
    spin_lock_irqsave(&q->lock, flags);
    if (list_empty(&wait->task_list))
        __add_wait_queue(q, wait);
    set_current_state(state);
    spin_unlock_irqrestore(&q->lock, flags);
}
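prepare_to_wait does two things: it inserts the wait queue entry __wait defined earlier into the wait queue headed by wq (unless it is already queued), and it sets the current process state to TASK_INTERRUPTIBLE. As soon as prepare_to_wait returns, condition is checked again. If it happens to be true by now, there is no need to sleep. If it is still false, the process gets ready to sleep. Because the state is set before condition is tested, a wake-up that arrives between the test and the call to schedule() simply sets the process back to TASK_RUNNING, so it is not lost.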
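The two steps can be mimicked in user space with a small list model. This is only a sketch under stated assumptions: all model_ names are made up for illustration, there is no locking, and the "process state" is just a field in a struct.

```c
#include <assert.h>

/* Hypothetical user-space model of prepare_to_wait; no locking. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_node { struct model_node *next, *prev; };

struct model_queue { struct model_node head; };   /* stands in for wait_queue_head_t */

struct model_waiter {
    enum model_state state;   /* stands in for current->state */
    struct model_node link;   /* stands in for wait->task_list */
};

static void model_node_init(struct model_node *n)
{
    n->next = n->prev = n;
}

static int model_node_unlinked(const struct model_node *n)
{
    return n->next == n;
}

/* Insert right after the head, as __add_wait_queue does. */
static void model_queue_add_head(struct model_queue *q, struct model_node *n)
{
    n->next = q->head.next;
    n->prev = &q->head;
    q->head.next->prev = n;
    q->head.next = n;
}

static int model_queue_len(const struct model_queue *q)
{
    int len = 0;
    for (const struct model_node *n = q->head.next; n != &q->head; n = n->next)
        len++;
    return len;
}

static void model_prepare_to_wait(struct model_queue *q,
                                  struct model_waiter *w,
                                  enum model_state state)
{
    if (model_node_unlinked(&w->link))      /* step 1: enqueue only once */
        model_queue_add_head(q, &w->link);
    w->state = state;                       /* step 2: mark as going to sleep */
}
```

Calling model_prepare_to_wait a second time on the same waiter, as the loop does on each pass, leaves the queue length at 1 and only resets the state.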
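Sleeping is done by calling schedule(). Because the current process has already been put into the TASK_INTERRUPTIBLE state, once schedule() switches away from it the scheduler will not pick it again until it is woken up (that is, set back to TASK_RUNNING).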
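Before calling schedule() to switch away, the code first checks whether a signal is pending. If one is, it returns -ERESTARTSYS immediately. If not, it calls schedule() and goes to sleep.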
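The purpose of the for (;;) loop is to make the process check condition once more after it is woken up. When several processes on the wait queue are woken at the same time, another process may already have grabbed the resource and made it unavailable again, so it is best to check again. (The kernel also provides a way to wake only one or a limited number of processes, the exclusive waiters. Interested readers can consult the related material.)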
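The control flow of the loop can be traced with a tiny user-space model. It is a sketch, not kernel code: model_wait_event_interruptible and struct model_step are hypothetical names, each array element scripts what condition and signal_pending(current) would report on one pass, and a "sleep" is just a counter increment. The model also folds the initial check done by wait_event_interruptible into the first pass of the loop. The value 512 is the kernel-internal ERESTARTSYS constant.

```c
#include <assert.h>

#define MODEL_ERESTARTSYS  512  /* value of the kernel-internal ERESTARTSYS */
#define MODEL_STILL_ASLEEP 1    /* hypothetical: script ended while still waiting */

struct model_step {
    int condition;        /* what `condition` evaluates to on this pass */
    int signal_pending;   /* what signal_pending(current) reports on this pass */
};

/* Replays the loop body once per scripted pass.
 * Returns 0 when the condition became true, -MODEL_ERESTARTSYS when a
 * signal cut the wait short, MODEL_STILL_ASLEEP if the script ran out.
 * *sleeps counts how many times the model "called schedule()". */
static int model_wait_event_interruptible(const struct model_step *steps,
                                          int nsteps, int *sleeps)
{
    *sleeps = 0;
    for (int i = 0; i < nsteps; i++) {
        if (steps[i].condition)          /* checked first, as in the macro */
            return 0;
        if (steps[i].signal_pending)     /* only reached when condition is false */
            return -MODEL_ERESTARTSYS;
        (*sleeps)++;                     /* schedule(): sleep until next wake-up */
    }
    return MODEL_STILL_ASLEEP;
}
```

For example, a script of { {0,0}, {0,0}, {1,0} } models a process that is woken once only to find the resource gone again: it sleeps twice and then returns 0. A script of { {0,0}, {0,1} } sleeps once and then returns -512.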
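The last step after the process is woken up is to call finish_wait(&wq, &__wait) to clean up. finish_wait sets the process state back to TASK_RUNNING and removes the wait queue entry from the wait queue if it is still queued.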
void finish_wait(wait_queue_head_t *q, wait_queue_t *wait)
{
    unsigned long flags;

    __set_current_state(TASK_RUNNING);
    if (!list_empty_careful(&wait->task_list)) {
        spin_lock_irqsave(&q->lock, flags);
        list_del_init(&wait->task_list);
        spin_unlock_irqrestore(&q->lock, flags);
    }
}
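The "remove only if still queued" check is what makes finish_wait safe to call whether or not the wake-up function has already taken the entry off the queue. A minimal user-space sketch of that behavior follows. All model_ names are hypothetical and locking is left out.

```c
#include <assert.h>

/* Hypothetical user-space model of finish_wait; no locking. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_node { struct model_node *next, *prev; };

struct model_waiter {
    enum model_state state;   /* stands in for current->state */
    struct model_node link;   /* stands in for wait->task_list */
};

static void model_node_init(struct model_node *n)
{
    n->next = n->prev = n;
}

static int model_node_unlinked(const struct model_node *n)
{
    return n->next == n;
}

/* Link n right after head. */
static void model_node_add(struct model_node *head, struct model_node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n and leave it pointing at itself, like list_del_init. */
static void model_node_del_init(struct model_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    model_node_init(n);
}

static void model_finish_wait(struct model_waiter *w)
{
    w->state = MODEL_RUNNING;              /* we are definitely running now */
    if (!model_node_unlinked(&w->link))    /* still queued? then dequeue */
        model_node_del_init(&w->link);
}
```

Calling model_finish_wait on an entry that has already been removed only resets the state and leaves the list untouched.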
After that, execution returns to the point where you were blocked in wait_event_interruptible(wq, condition) and continues from there.
3. The wake-up process of a wait queue
By now we understand how a process goes to sleep on a wait queue. Next we analyze how a wait queue is woken up.
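Using a wait queue has one precondition: someone has to wake it up. If nobody does, all the processes sleeping on that wait queue would turn into "zombie processes" that never run again.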
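For a device driver, the device's wait queue is usually woken from the interrupt handler. A driver typically provides its own read and write wait queues to implement the blocking and O_NONBLOCK behavior that the upper layer (user level) expects. When the device resource becomes available and the driver finds processes sleeping on its read or write wait queue, it wakes that wait queue.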
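From the user-level side, the difference between the two modes is easy to observe. The sketch below uses an ordinary pipe as a stand-in for a device file (an assumption made only so the example runs anywhere). set_nonblocking and try_read_byte are hypothetical helper names. With O_NONBLOCK set, a read on an empty pipe fails with EAGAIN instead of putting the caller to sleep.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode.
 * Returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to read one byte without sleeping.
 * Returns 1 if a byte was read, 0 if the read would have blocked,
 * -1 on end of file or any other error. */
static int try_read_byte(int fd, char *out)
{
    ssize_t n = read(fd, out, 1);
    if (n == 1)
        return 1;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}
```

A caller would create the pipe with pipe(fds), call set_nonblocking(fds[0]), and then see try_read_byte report "would block" until something has been written to fds[1]. Without O_NONBLOCK the same read would sleep until data arrives.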
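A wait queue is woken with one of the wake_up_* functions. Here we take wake_up_interruptible, the counterpart of wait_event_interruptible, as the example. It is defined as follows: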
#define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
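The parameter x is the wait queue head of the wait queue to wake. The call wakes processes in the TASK_INTERRUPTIBLE state, and by default it wakes all non-exclusive waiters on the queue plus one exclusive waiter.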
__wake_up is defined as follows:
/**
 * __wake_up - wake up threads blocked on a waitqueue.
 * @q: the waitqueue
 * @mode: which threads
 * @nr_exclusive: how many wake-one or wake-many threads to wake up
 * @key: is directly passed to the wakeup function
 */
void __wake_up(wait_queue_head_t *q, unsigned int mode,
               int nr_exclusive, void *key)
{
    unsigned long flags;

    spin_lock_irqsave(&q->lock, flags);
    __wake_up_common(q, mode, nr_exclusive, 0, key);
    spin_unlock_irqrestore(&q->lock, flags);
}

__wake_up simply calls __wake_up_common to do the actual wake-up work.

__wake_up_common is defined as follows:

/*
 * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
 * wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
 * number) then we wake all the non-exclusive tasks and one exclusive task.
 *
 * There are circumstances in which we can try to wake a task which has already
 * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
 * zero in this (rare) case, and we handle it by continuing to scan the queue.
 */
static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
                             int nr_exclusive, int wake_flags, void *key)
{
    wait_queue_t *curr, *next;

    list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
        unsigned flags = curr->flags;

        if (curr->func(curr, mode, wake_flags, key) &&
            (flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
            break;
    }
}
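__wake_up_common walks the entries in the wait queue and invokes each entry's wake-up function, stopping early once nr_exclusive exclusive waiters have been woken. The wake-up function here is autoremove_wake_function, the default installed when the wait queue entry was defined with DEFINE_WAIT(__wait). autoremove_wake_function eventually calls try_to_wake_up (through default_wake_function), which sets the process to the TASK_RUNNING state, and on success it also removes the entry from the wait queue. The scheduler can then pick the process again, so the process is woken and continues to run.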
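The early-exit condition is compact, so it helps to run it on a toy queue. The following is a user-space sketch and not kernel code: the queue is a plain array instead of a linked list, model_wake_fn is a hypothetical stand-in for curr->func, and "waking" just sets a flag. The value 0x01 matches WQ_FLAG_EXCLUSIVE.

```c
#include <assert.h>

#define MODEL_FLAG_EXCLUSIVE 0x01   /* same bit value as WQ_FLAG_EXCLUSIVE */

struct model_entry {
    unsigned int flags;
    int runnable;   /* 0 = asleep; the wake callback sets it to 1 */
};

/* Stand-in for curr->func(): returns 1 only if it really woke the task. */
static int model_wake_fn(struct model_entry *e)
{
    if (e->runnable)
        return 0;
    e->runnable = 1;
    return 1;
}

/* Array walk in place of list_for_each_entry_safe().
 * Returns how many entries were actually woken. */
static int model_wake_up_common(struct model_entry *q, int n, int nr_exclusive)
{
    int woken = 0;

    for (int i = 0; i < n; i++) {
        unsigned int flags = q[i].flags;
        int ok = model_wake_fn(&q[i]);

        woken += ok;
        /* Same test as the kernel loop: count down only for exclusive
         * waiters that were really woken, and stop when the budget hits 0.
         * With nr_exclusive == 0 the counter goes negative and never
         * equals 0, so everything is woken. */
        if (ok && (flags & MODEL_FLAG_EXCLUSIVE) && !--nr_exclusive)
            break;
    }
    return woken;
}
```

With a queue of two non-exclusive entries followed by two exclusive ones, nr_exclusive = 1 wakes the first three and leaves the last one asleep, while nr_exclusive = 0 wakes all four. The ordering in the example reflects that the kernel adds exclusive waiters at the tail of the queue, so non-exclusive waiters are reached first.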