The SO_KEEPALIVE option is only an on/off switch. On Linux, the default keepalive parameters are:

    $ sudo sysctl -a | grep keepalive
    net.ipv4.tcp_keepalive_time = 7200
    net.ipv4.tcp_keepalive_probes = 9
    net.ipv4.tcp_keepalive_intvl = 75

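The settings above mean that if a connection has been idle for 7200 s with no data exchanged, the end that enabled keepalive sends a keepalive probe to its peer. There are three possible outcomes:
- The peer replies with an ACK. The local TCP stack considers the connection still alive and waits another 7200 s before sending the next keepalive probe.
- The peer replies with a RST. This means the peer has restarted and no longer knows about this connection; the local application should close it.
- The peer does not reply at all. The local end retries, and if the peer is still unreachable after 9 probes (sent 75 seconds apart), an error is returned to the application: ETIMEDOUT (no response at all) or EHOST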
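
To make the third outcome concrete, the time needed to notice a dead peer can be estimated as the idle time plus one probe interval for each unanswered probe. The sketch below is only an approximation: the helper names `keepalive_detect_seconds` and `print_keepalive_estimate` are hypothetical, and it assumes every probe goes unanswered and ignores kernel timer granularity.

```c
#include <stdio.h>

/* Hypothetical helper: rough upper bound, in seconds, on how long a silent
 * peer goes undetected. Assumes that every probe goes unanswered. */
long keepalive_detect_seconds(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}

/* Print the estimate for a given configuration. */
void print_keepalive_estimate(long idle, long intvl, long probes)
{
    printf("dead peer detected after about %ld s\n",
           keepalive_detect_seconds(idle, intvl, probes));
}
```

With the Linux defaults, `keepalive_detect_seconds(7200, 75, 9)` gives 7875 seconds, a little over two hours and eleven minutes, which is why many servers shorten these values per socket.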
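What if an application wants to change the default keepalive behavior? The answer is to use the TCP options TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT. First, let's look at how they are used: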

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    int setKeepAlive(int fd, int interval)
    {
        int val = 1;

        /* Turn keepalive on for this socket. */
        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof(val)) < 0) {
            printf("setsockopt SO_KEEPALIVE: %s\n", strerror(errno));
            return -1;
        }

        /* Send first probe after `interval' seconds. */
        val = interval;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val)) < 0) {
            printf("setsockopt TCP_KEEPIDLE: %s\n", strerror(errno));
            return -1;
        }

        /* Send next probes after the specified interval. Note that we set the
         * delay as interval / 3, as we send three probes before detecting
         * an error (see the next setsockopt call). */
        val = interval / 3;
        if (val == 0) val = 1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &val, sizeof(val)) < 0) {
            printf("setsockopt TCP_KEEPINTVL: %s\n", strerror(errno));
            return -1;
        }

        /* Consider the socket in error state after we send three ACK
         * probes without getting a reply. */
        val = 3;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &val, sizeof(val)) < 0) {
            printf("setsockopt TCP_KEEPCNT: %s\n", strerror(errno));
            return -1;
        }

        return 0;
    }

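- TCP_KEEPIDLE: how long the connection must be idle, with no data sent, before the first keepalive probe is sent, in seconds
- TCP_KEEPINTVL: the interval between two consecutive probes, in seconds
- TCP_KEEPCNT: the maximum number of probes to send before an inactive connection is closed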
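
One way to check that the three options took effect is to read them back with `getsockopt`. The following is a minimal sketch that assumes Linux, where these option names are defined in `<netinet/tcp.h>`; the helpers `get_tcp_opt` and `keepalive_roundtrip` are hypothetical names introduced only for illustration.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: read one integer TCP-level option, or -1 on error. */
int get_tcp_opt(int fd, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, optname, &val, &len) < 0)
        return -1;
    return val;
}

/* Hypothetical helper: set the keepalive options on a fresh, unconnected
 * TCP socket and read them back into out[0..2] (idle, interval, count).
 * Returns 0 on success, -1 on failure. */
int keepalive_roundtrip(int idle, int intvl, int cnt, int out[3])
{
    int on = 1;
    int rc = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0 &&
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == 0 &&
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == 0 &&
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == 0) {
        out[0] = get_tcp_opt(fd, TCP_KEEPIDLE);
        out[1] = get_tcp_opt(fd, TCP_KEEPINTVL);
        out[2] = get_tcp_opt(fd, TCP_KEEPCNT);
        rc = 0;
    }

    close(fd);
    return rc;
}
```

Calling `keepalive_roundtrip(60, 20, 3, out)` should leave the same three values in `out`, confirming that the per-socket settings override the system-wide sysctl defaults for that socket only.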
Capturing packets with tcpdump shows the detailed behavior of the end that set the options above.