The pipe_wait problem (repost)

This article is a repost. 2017-06-11 21:48, filed under linuxAPP_Process

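I recently ran into a pipe_wait problem: when a parent process invoked a child process, the child blocked, and cat /proc/$child/wchan printed pipe_wait. The process stayed blocked in pipe_wait and made no progress. The reposted article below analyzes the problem thoroughly.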

Background

To call a shell script from Java you can use Runtime.exec or ProcessBuilder.start. Both return a Process object, through which you can obtain the script's output and then handle it in Java.
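As an illustration of that pattern, here is a minimal sketch that starts a command with ProcessBuilder, reads its output line by line, and only then waits for the exit code. The class name RunScript, the helper run, and the sample command are assumptions made for this example, not part of the original article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class RunScript {
    // Hypothetical helper: runs a command and returns its combined output,
    // followed by a line reporting the exit code.
    static String run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so one reader is enough
        Process process = pb.start();

        StringBuilder output = new StringBuilder();
        // Read until EOF first, so the child can never block on a full pipe.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        int exitCode = process.waitFor(); // safe now: all output has been consumed
        output.append("exit=").append(exitCode).append('\n');
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample command chosen for illustration; replace it with the script to run.
        System.out.print(run("sh", "-c", "echo hello"));
    }
}
```

Reading before waiting is the important ordering here; the rest of the article shows what happens when the order is reversed.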

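Secure coding guidelines usually point out that calling Process.waitFor can block the process or even cause a deadlock. How should that be understood? A concrete example illustrates it.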

Problem description

Call a shell script from Java code. Once it starts, both the Java process and the shell process hang and never finish.

Java code: processtest.java


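try
{
    // cmd is the command line that launches doecho.sh
    Process process = Runtime.getRuntime().exec(cmd);
    System.out.println("start run cmd=" + cmd);
    // Blocks until the child exits; the child's output is never read
    process.waitFor();
    System.out.println("finish run cmd=" + cmd);
}
catch (Exception e)
{
    e.printStackTrace();
}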

The shell script being called: doecho.sh


#!/bin/bash
for((i=0; ;i++))
do
echo -n "0123456789"
echo $i >> count.log
done

Cause of the hang

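Calling Runtime.exec in the main process creates a child process that runs the shell script. Once created, the child runs independently of the main process.
Because the main process needs to wait for the script to finish and then handle its return value or output, it calls Process.waitFor to wait for the child.
The shell script shows that the child does nothing but print continuously. The main process could fetch and handle that output through Process.getInputStream and Process.getErrorStream.
Here the child keeps sending data to the main process, but the main process is already suspended in Process.waitFor. Once the buffer between the child and the main process fills up, the child cannot write any more data and is suspended as well.
The child waits for the main process to read the data, and the main process waits for the child to finish. The two processes wait on each other, which ends in a deadlock.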

Solution

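Based on the analysis above, it is enough for the main process to keep draining the buffered data before it calls waitFor. So before waitFor we can start two extra threads, one to consume the InputStream and one to consume the ErrorStream.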
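The original sample code did not survive in this repost, so the following is only a sketch of the approach just described, not the author's code. The class name DrainStreams, the helpers drain and runAndDrain, and the test command are assumptions. Because doecho.sh loops forever, the example uses a finite command that writes 200000 bytes, well beyond a 64K pipe.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

public class DrainStreams {
    // Starts a thread that reads the stream to EOF and counts the bytes it saw.
    static Thread drain(InputStream in, AtomicLong counter) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            try (InputStream stream = in) {
                int n;
                while ((n = stream.read(buf)) != -1) {
                    counter.addAndGet(n); // a real program would process the data here
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        t.start();
        return t;
    }

    // Returns {exitCode, stdoutBytes, stderrBytes}.
    static long[] runAndDrain(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        AtomicLong outBytes = new AtomicLong();
        AtomicLong errBytes = new AtomicLong();
        // Both drain threads are running before waitFor is called.
        Thread outThread = drain(process.getInputStream(), outBytes);
        Thread errThread = drain(process.getErrorStream(), errBytes);
        int exitCode = process.waitFor();
        outThread.join(); // make sure the remaining buffered data has been read
        errThread.join();
        return new long[] {exitCode, outBytes.get(), errBytes.get()};
    }

    public static void main(String[] args) throws Exception {
        long[] r = runAndDrain("sh", "-c", "head -c 200000 /dev/zero");
        System.out.println("exit=" + r[0] + " stdout=" + r[1] + " stderr=" + r[2]);
    }
}
```

Without the two drain threads, the same command would be expected to hang in waitFor once the child has filled the pipe.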

What the JDK documentation says

By default, the created subprocess does not have its own terminal or console. All its standard I/O (i.e. stdin, stdout, stderr) operations will be redirected to the parent process, where they can be accessed via the streams obtained using the methods getOutputStream(), getInputStream(), and getErrorStream(). The parent process uses these streams to feed input to and get output from the subprocess. Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, or even deadlock.

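Two points follow from the JDK description:

If the buffers behind the standard input and output streams are of limited size, reads and writes may block or deadlock. (This was analyzed above.)
The child's standard I/O is redirected to the parent, which can access it through the corresponding methods. (How is the I/O redirected?)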
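The redirection to the parent is only the default. As a side illustration (not something the original article proposes), ProcessBuilder can send the child's output to a file instead, in which case the parent does not have to drain that output. The class name RedirectToFile, the helper runToFile, and the temporary file name are assumptions made for this sketch.

```java
import java.io.File;
import java.io.IOException;

public class RedirectToFile {
    // Runs a command with stdout and stderr written to a file, then waits for it.
    static int runToFile(File log, String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // stderr follows stdout
        pb.redirectOutput(log);       // stdout goes to the file, not to a pipe
        Process process = pb.start();
        return process.waitFor();     // nothing to drain, so waiting directly is fine
    }

    public static void main(String[] args) throws Exception {
        File log = File.createTempFile("child-output", ".log");
        int exitCode = runToFile(log, "sh", "-c", "head -c 200000 /dev/zero");
        System.out.println("exit=" + exitCode + " bytes=" + log.length());
        log.delete();
    }
}
```

This trades live access to the output for simplicity; when the parent must react to the output as it arrives, draining the streams in threads remains the way to go.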

Behind the scenes

To answer that question, we can try to analyze things at the system level.

First, ps shows two additional processes on Linux: a Java process and a shell process, with the shell being a child of the Java process.

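Next, the state of the shell process is shown as pipe_w. At first I assumed pipe_w meant pipe_write, but /proc/pid/wchan reveals that it is a truncation of pipe_wait. /proc/pid/wchan usually shows either a memory address or the name of the kernel function in which the process is sleeping. So the process appears to be waiting on a pipe operation and has been suspended. A pipe is a form of IPC, commonly used between parent and child processes, so we can guess that the parent and child got blocked while communicating through a pipe.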

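In addition, look at the fd information of both processes, that is /proc/pid/fd. The child's descriptors 0/1/2 (stdin/stdout/stderr) are each redirected to a pipe, three pipes in total, and the parent holds references to the same three pipes.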
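The same inspection can be sketched from Java. This example is Linux-only and assumes Java 9 or later for Process.pid(); the class name ChildFds, the helper stdioTargets, and the choice of cat as a child that blocks on its stdin are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ChildFds {
    // Returns where fd 0, 1 and 2 of the given process point, by reading the
    // symbolic links under /proc/<pid>/fd (Linux only).
    static String[] stdioTargets(long pid) throws IOException {
        String[] targets = new String[3];
        for (int fd = 0; fd < 3; fd++) {
            Path link = Paths.get("/proc", Long.toString(pid), "fd", Integer.toString(fd));
            targets[fd] = Files.readSymbolicLink(link).toString();
        }
        return targets;
    }

    public static void main(String[] args) throws Exception {
        // cat with no arguments blocks reading its stdin, so it stays alive
        // long enough to be inspected.
        Process child = new ProcessBuilder("cat").start();
        try {
            String[] targets = stdioTargets(child.pid());
            for (int fd = 0; fd < targets.length; fd++) {
                System.out.println("fd " + fd + " -> " + targets[fd]);
            }
        } finally {
            child.destroy();
        }
    }
}
```

Each link target is expected to have the form pipe:[inode], and the same inode numbers should appear among the links under the parent's own /proc/pid/fd.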

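To sum up, the sequence should be as follows: the child keeps writing data into the pipe while the parent never reads from it, so the pipe fills up, the child cannot write any more, and it ends up in the pipe_wait state. So how big is a pipe?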

Measuring the pipe size

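Because doecho.sh logs a counter for every write, count.log shows how much data the child ended up sending. After the child hung, the last value in count.log stayed at 6543. So the child hung after writing 6543*10=65430 bytes into the pipe. 65536-65430=106 bytes, which means it stopped 106 bytes short of 64K.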

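Another test writes 1K at a time and records how much can be written in total. The test script is test_pipe_size.sh. The result is 64K. The two results differ by 106 bytes, so how big is the pipe really?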
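test_pipe_size.sh itself is not included in this repost. As a rough stand-in for the same idea, the sketch below writes 1 KiB chunks into the stdin pipe of a child that never reads, and reports how many bytes were accepted before the writer blocked. The class name PipeCapacity, the helper measure, the sleep 60 child, and the 500 ms polling interval are all assumptions.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicLong;

public class PipeCapacity {
    // Writes 1 KiB chunks into the stdin pipe of a child that never reads it,
    // and returns the number of bytes accepted before the writer blocked.
    static long measure() throws IOException, InterruptedException {
        Process child = new ProcessBuilder("sleep", "60").start();
        AtomicLong written = new AtomicLong();
        Thread writer = new Thread(() -> {
            byte[] chunk = new byte[1024];
            try {
                OutputStream out = child.getOutputStream();
                while (true) {
                    out.write(chunk);
                    out.flush(); // push the chunk into the pipe before counting it
                    written.addAndGet(chunk.length);
                }
            } catch (IOException e) {
                // Expected once the child is destroyed and the pipe is closed.
            }
        });
        writer.setDaemon(true); // do not keep the JVM alive for a blocked writer
        writer.start();

        // Poll until the counter has stopped growing: the writer is then blocked.
        long previous;
        do {
            previous = written.get();
            Thread.sleep(500);
        } while (previous == 0 || written.get() != previous);

        child.destroy();
        return previous;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bytes accepted before blocking: " + measure());
    }
}
```

On a Linux system with 4K pages and default pipe settings this would be expected to report 65536, in line with the 64K result above, though the exact figure depends on the kernel and its pipe limits.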

Pipe analysis on Linux

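The most direct approach is to read the source. The pipe implementation lives mainly in linux/fs/pipe.c. We mainly look at pipe_wait, which is called from functions such as pipe_read, shown here.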

static ssize_t
pipe_read(struct kiocb *iocb, struct iov_iter *to)
{
        size_t total_len = iov_iter_count(to);
        struct file *filp = iocb->ki_filp;
        struct pipe_inode_info *pipe = filp->private_data;
        int do_wakeup;
        ssize_t ret;

        /* Null read succeeds. */
        if (unlikely(total_len == 0))
                return 0;

        do_wakeup = 0;
        ret = 0;
        __pipe_lock(pipe);
        for (;;) {
                int bufs = pipe->nrbufs;
                if (bufs) {
                        int curbuf = pipe->curbuf;
                        struct pipe_buffer *buf = pipe->bufs + curbuf;
                        const struct pipe_buf_operations *ops = buf->ops;
                        size_t chars = buf->len;
                        size_t written;
                        int error;

                        if (chars > total_len)
                                chars = total_len;

                        error = ops->confirm(pipe, buf);
                        if (error) {
                                if (!ret)
                                        ret = error;
                                break;
                        }

                        written = copy_page_to_iter(buf->page, buf->offset, chars, to);
                        if (unlikely(written < chars)) {
                                if (!ret)
                                        ret = -EFAULT;
                                break;
                        }
                        ret += chars;
                        buf->offset += chars;
                        buf->len -= chars;

                        /* Was it a packet buffer? Clean up and exit */
                        if (buf->flags & PIPE_BUF_FLAG_PACKET) {
                                total_len = chars;
                                buf->len = 0;
                        }

                        if (!buf->len) {
                                buf->ops = NULL;
                                ops->release(pipe, buf);
                                curbuf = (curbuf + 1) & (pipe->buffers - 1);
                                pipe->curbuf = curbuf;
                                pipe->nrbufs = --bufs;
                                do_wakeup = 1;
                        }
                        total_len -= chars;
                        if (!total_len)
                                break;  /* common path: read succeeded */
                }
                if (bufs)       /* More to do? */
                        continue;
                if (!pipe->writers)
                        break;
                if (!pipe->waiting_writers) {
                        /* syscall merging: Usually we must not sleep
                         * if O_NONBLOCK is set, or if we got some data.
                         * But if a writer sleeps in kernel space, then
                         * we can wait for that data without violating POSIX.
                         */
                        if (ret)
                                break;
                        if (filp->f_flags & O_NONBLOCK) {
                                ret = -EAGAIN;
                                break;
                        }
                }
                if (signal_pending(current)) {
                        if (!ret)
                                ret = -ERESTARTSYS;
                        break;
                }
                if (do_wakeup) {
                        wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
                        kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
                }
                pipe_wait(pipe);
        }
        __pipe_unlock(pipe);

        /* Signal writers asynchronously that there is more room. */
        if (do_wakeup) {
                wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
                kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
        }
        if (ret > 0)
                file_accessed(filp);
        return ret;
}

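The code shows that the pipe is organized as a ring: a circular array of struct pipe_buffer entries, each of which corresponds to one page. By default the ring has 16 entries, so the total pipe buffer size is 16*page. With a 4K page size, the total pipe buffer size is 16*4K=64K.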
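To make the ring arithmetic concrete, here is a deliberately simplified model in Java. It is not the kernel's algorithm: it tracks only byte counts, gives every write its own slot, and returns false where a real writer would sleep in pipe_wait. The class name PipeRingModel and its methods are hypothetical; only curbuf, nrbufs, and the index mask mirror the kernel code shown above.

```java
public class PipeRingModel {
    static final int BUFFERS = 16;      // number of slots; must be a power of two
    static final int PAGE_SIZE = 4096;  // assumed page size in bytes

    final int[] len = new int[BUFFERS]; // bytes currently held in each slot
    int curbuf = 0;                     // index of the oldest occupied slot
    int nrbufs = 0;                     // number of occupied slots

    int capacity() {
        return BUFFERS * PAGE_SIZE;
    }

    // Places up to one page of data into the next free slot.
    // Returns false when the ring is full (a real writer would block here).
    boolean write(int bytes) {
        if (bytes <= 0 || bytes > PAGE_SIZE) {
            throw new IllegalArgumentException("bytes must be in 1.." + PAGE_SIZE);
        }
        if (nrbufs == BUFFERS) {
            return false;
        }
        int newbuf = (curbuf + nrbufs) & (BUFFERS - 1); // wrap around with a mask
        len[newbuf] = bytes;
        nrbufs++;
        return true;
    }

    // Consumes up to `want` bytes starting from the oldest slot; returns bytes read.
    int read(int want) {
        int total = 0;
        while (want > 0 && nrbufs > 0) {
            int chars = Math.min(len[curbuf], want);
            len[curbuf] -= chars;
            total += chars;
            want -= chars;
            if (len[curbuf] == 0) {
                // Slot drained: advance the ring, same mask as in pipe_read.
                curbuf = (curbuf + 1) & (BUFFERS - 1);
                nrbufs--;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        PipeRingModel pipe = new PipeRingModel();
        int pages = 0;
        while (pipe.write(PAGE_SIZE)) {
            pages++;
        }
        System.out.println("pages written before full: " + pages); // prints 16
        System.out.println("capacity in bytes: " + pipe.capacity()); // prints 65536
    }
}
```

Reading one page from a full ring frees exactly one slot, after which a single further write succeeds.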