EPIPE and ECONNRESET


page1:

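Suppose Process X on Server A has a socket M that is connected over TCP to socket N of Process Y on Server B. As far as I know, there are two situations in which an RST segment is sent:

(1) X exits without closing the socket, and Y then keeps sending data to M; A's kernel sends an RST to socket N.

(2) X sets SO_LINGER with l_onoff nonzero and l_linger equal to 0; when X then closes socket M, an RST is also sent to socket N.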

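Once socket N has received the RST, select reports the socket as readable, and then:

(a) If recv is called at this point, it returns -1 with errno set to ECONNRESET. Calling recv again reportedly returns -1 with errno set to EPIPE; note that SIGPIPE itself is raised by writes, not by recv, and the result of a second recv varies by platform.

(b) If send is called at this point, it returns -1 with errno set to EPIPE, and the SIGPIPE signal is raised as well.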
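Case (2) and the recv/send behavior described above can be tried on a single machine. The following is a minimal sketch, assuming Linux and a loopback TCP connection; `abortive_close`, `loopback_pair` and `observe_reset` are hypothetical helper names introduced only for this illustration. It resets the connection from one side via SO_LINGER and records the errno that the other side sees from recv and then from send. On Linux I would expect recv to report ECONNRESET and the following send to report EPIPE, but the exact errno sequence is platform-dependent.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Close fd abortively: with l_onoff != 0 and l_linger == 0, close()
 * discards unsent data and the kernel sends RST instead of FIN. */
int abortive_close(int fd)
{
    struct linger lg;
    lg.l_onoff = 1;
    lg.l_linger = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) < 0)
        return -1;
    return close(fd);
}

/* Hypothetical demo helper: build a connected TCP pair over 127.0.0.1.
 * pair[0] is the connecting side, pair[1] the accepted side. */
int loopback_pair(int pair[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                    /* let the kernel pick a port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }
    pair[0] = socket(AF_INET, SOCK_STREAM, 0);
    if (pair[0] < 0) {
        close(lfd);
        return -1;
    }
    if (connect(pair[0], (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(pair[0]);
        close(lfd);
        return -1;
    }
    pair[1] = accept(lfd, NULL, NULL);
    close(lfd);
    if (pair[1] < 0) {
        close(pair[0]);
        return -1;
    }
    return 0;
}

/* Reset the connection from one side, then record the errno reported by
 * recv() and by a following send() on the other side (0 means the call
 * did not fail). Returns 0 if the experiment ran, -1 on setup failure. */
int observe_reset(int *recv_errno, int *send_errno)
{
    int pair[2];
    char buf[16];
    struct pollfd pfd;

    if (loopback_pair(pair) < 0)
        return -1;
    if (abortive_close(pair[1]) < 0) {
        close(pair[0]);
        return -1;
    }

    /* Wait until the surviving socket is reported readable (RST seen). */
    pfd.fd = pair[0];
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 2000) <= 0) {
        close(pair[0]);
        return -1;
    }

    *recv_errno = (recv(pair[0], buf, sizeof(buf), 0) < 0) ? errno : 0;
    /* MSG_NOSIGNAL (Linux) suppresses SIGPIPE for this one call. */
    *send_errno = (send(pair[0], "x", 1, MSG_NOSIGNAL) < 0) ? errno : 0;
    close(pair[0]);
    return 0;
}
```

A caller would pass two int pointers to `observe_reset` and print the results with strerror. MSG_NOSIGNAL is used here only so that the sketch does not need to touch the process-wide SIGPIPE disposition.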

 

 

page2:

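I had not done C development for a long time and recently went back to it. I heard that another project team had hit a problem in their socket code: the amount of data sent did not match the amount received. I suggested they use writen's retry loop to avoid errors caused by signal interruption, but the problem persisted after they adopted it, so the PM asked me to look into it. I read UNP n years ago, have not done low-level development in a long time, and did not have UNP vol. 1 at hand, so I wrote a test program to see what can actually happen.

Test environment: AS3 and Red Hat 9 (nc is not installed by default).

First download and build the UNP source:

wget http://www.unpbook.com/unpv13e.tar.gz
tar xzvf *.tar.gz
configure
make lib

Then, based on str_cli.c and tcpcli01.c, I wrote the test code client.c: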

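#include    "unp.h"

#define MAXBUF 40960

/* Log the signal number, then re-install the handler (needed on systems
 * where signal() resets the disposition after delivery). printf() is not
 * async-signal-safe; it is tolerable only in a throwaway test like this. */
void processSignal(int signo)
{
    printf("Signal is %d\n", signo);
    signal(signo, processSignal);
}

/* Send MAXBUF bytes of 'a' every 5 seconds, forever. */
void
str_cli(FILE *fp, int sockfd)
{
    /* static: keeps the buffer off the stack once MAXBUF is raised to
     * 4096000 in test 3 */
    static char    sendline[MAXBUF];

    while (1) {
        memset(sendline, 'a', sizeof(sendline));
        printf("Begin send %d data\n", MAXBUF);
        Writen(sockfd, sendline, sizeof(sendline));
        sleep(5);
    }
}

int
main(int argc, char **argv)
{
    int                    sockfd;
    struct sockaddr_in     servaddr;

    signal(SIGPIPE, SIG_IGN);              /* test 1: ignore SIGPIPE */
    //signal(SIGPIPE, processSignal);      /* test 2: catch SIGPIPE instead */

    if (argc != 2)
        err_quit("usage: client <port>");

    sockfd = Socket(AF_INET, SOCK_STREAM, 0);

    /* Always connect to the given port on the local machine. */
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(atoi(argv[1]));
    Inet_pton(AF_INET, "127.0.0.1", &servaddr.sin_addr);

    Connect(sockfd, (SA *) &servaddr, sizeof(servaddr));

    str_cli(stdin, sockfd);        /* do it all */

    exit(0);
}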

 

 

To make the error output easier to observe, I also modified lib/writen.c to add some logging:

 

 

/* include writen */
#include    "unp.h"

ssize_t                        /* Write "n" bytes to a descriptor. */
writen(int fd, const void *vptr, size_t n)
{
    size_t         nleft;
    ssize_t        nwritten;
    const char    *ptr;

    ptr = vptr;
    nleft = n;
    while (nleft > 0) {
        printf("Begin Writen %zu\n", nleft);
        if ( (nwritten = write(fd, ptr, nleft)) <= 0) {
            if (nwritten < 0 && errno == EINTR) {
                printf("interrupted\n");
                nwritten = 0;        /* and call write() again */
            }
            else
                return(-1);            /* error */
        }

        nleft -= nwritten;
        ptr += nwritten;
        /* errno is only meaningful after a failed call; after a successful
         * write this prints whatever value was left over */
        printf("Already write %zd, left %zu, errno=%d\n", nwritten, nleft, errno);
    }
    return(n);
}
/* end writen */

void
Writen(int fd, void *ptr, size_t nbytes)
{
    if (writen(fd, ptr, nbytes) != nbytes)
        err_sys("writen error");
}

 

 

client.c goes in the tcpcliserv directory; I modified the Makefile there to add a build target for client.c:

 


client: client.c
	${CC} ${CFLAGS} -o $@ $< ${LIBS}

 

 

Now the tests can begin.

Test 1: SIGPIPE ignored; the peer's receiving process is closed before writen

Server (local machine):

nc -l -p 30000

Client (local machine):

./client 30000
Begin send 40960 data
Begin Writen 40960
Already write 40960, left 0, errno=0
Begin send 40960 data
Begin Writen 40960
Already write 40960, left 0, errno=0

Stop the server at this point; the client then prints:

Begin send 40960 data
Begin Writen 40960
writen error: Broken pipe(32)

Conclusion: if the peer's socket has gone away before the write, write on the sending side returns -1 with errno set to EPIPE (32).

Test 2: SIGPIPE caught; the peer's receiving process is closed before writen

Modify the client code to catch the SIGPIPE signal:

        //signal(SIGPIPE, SIG_IGN);

        signal(SIGPIPE, processSignal);

 

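Server (local machine):

nc -l -p 30000

Client (local machine):

make client
./client 30000
Begin send 40960 data
Begin Writen 40960
Already write 40960, left 0, errno=0
Begin send 40960 data
Begin Writen 40960
Already write 40960, left 0, errno=0

Stop the server at this point; the client then prints:

Begin send 40960 data
Begin Writen 40960
Signal is 13
writen error: Broken pipe(32)

Conclusion: if the peer's socket has gone away before the write, the SIGPIPE handler runs first when the sender calls write, and then write returns -1 with errno set to EPIPE (32).

Test 3: the peer's receiving process is closed while writen is in progress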

To make this easier to trigger, increase the amount of data in a single write by changing MAXBUF to 4096000.

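Server (local machine):

nc -l -p 30000

Client (local machine):

make client
./client 30000
Begin send 4096000 data
Begin Writen 4096000

Stop the server at this point; the client then prints:

Already write 589821, left 3506179, errno=0
Begin Writen 3506179
writen error: Connection reset by peer(104)

Conclusion: if the peer's socket goes away while a socket write is in progress, write on the sending side first returns the number of bytes already sent; the next write returns -1 with errno set to ECONNRESET (104).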
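Test 3 shows that a failed writen can hide a partial success: 589821 bytes had already been accepted when the error came back. The following is a minimal sketch of a writen-like loop that reports that count to the caller. `send_all` is a hypothetical name, not part of the UNP library, and MSG_NOSIGNAL is assumed to be available (it is on Linux).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Try to send all n bytes. Returns 0 when everything was handed to the
 * kernel, or -1 on error with errno set. In both cases *sent holds the
 * number of bytes the kernel accepted before returning. */
int send_all(int fd, const void *buf, size_t n, size_t *sent)
{
    const char *p = buf;

    *sent = 0;
    while (*sent < n) {
        /* MSG_NOSIGNAL (Linux): report EPIPE instead of raising SIGPIPE. */
        ssize_t k = send(fd, p + *sent, n - *sent, MSG_NOSIGNAL);
        if (k < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte: retry */
            return -1;          /* EPIPE, ECONNRESET, ... */
        }
        *sent += (size_t)k;
    }
    return 0;
}
```

Note that `*sent` counts bytes accepted by the local kernel, not bytes the peer actually read, so a mismatch between sender and receiver totals can only be settled by an application-level acknowledgement.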

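Why do the results differ across these tests, when in every case the sender writes again after the peer has already torn down the socket? The answer can be found in UNP sections 5.12 and 5.13, which I located later: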

The client's call to readline may happen before the server's RST is received by the client, or it may happen after. If the readline happens before the RST is received, as we've shown in our example, the result is an unexpected EOF in the client. But if the RST arrives first, the result is an ECONNRESET ("Connection reset by peer") error return from readline.

 

This explains test 3: the RST arrived while the write was in progress, so the error reported is ECONNRESET.

What happens if the client ignores the error return from readline and writes more data to the server? This can happen, for example, if the client needs to perform two writes to the server before reading anything back, with the first write eliciting the RST.

The rule that applies is: When a process writes to a socket that has received an RST, the SIGPIPE signal is sent to the process. The default action of this signal is to terminate the process, so the process must catch the signal to avoid being involuntarily terminated.

If the process either catches the signal and returns from the signal handler, or ignores the signal, the write operation returns EPIPE.

 

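This explains tests 1 and 2: when a process writes to a socket that has already received an RST, the kernel sends SIGPIPE to the writing process; whether the process catches or ignores the signal, write returns the EPIPE error.

UNP therefore recommends that applications handle SIGPIPE as their needs dictate, or at the very least not leave it at the default disposition. The default action terminates the process, which makes it hard to work out why your application exited.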

