http://yangjunwei.com/a/723.html
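Analyzing frequent 502 Bad Gateway errors with LNMP on CentOS
Recently my VPS kept returning Nginx 502 Bad Gateway errors, which made the site unreachable even though FTP and SSH still connected normally. It was a real headache, so this time I set out to fix it properly.
Given the symptoms, php-fpm was the likely culprit, so I started with its log file, /usr/local/php/logs/php-fpm.log.
The log looks roughly like this (taking the entries from the 28th and 29th as a sample):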
Jan 28 22:50:00.309235 [NOTICE] fpm_unix_init_main(), line 284: getrlimit(nofile): max:1024, cur:1024
Jan 28 22:50:00.309552 [NOTICE] fpm_event_init_main(), line 88: libevent: using epoll
Jan 28 22:50:00.309617 [NOTICE] fpm_init(), line 52: fpm is running, pid 7967
Jan 28 22:50:00.310444 [NOTICE] fpm_children_make(), line 352: child 7968 (pool default) started
Jan 28 22:50:00.311328 [NOTICE] fpm_children_make(), line 352: child 7969 (pool default) started
Jan 28 22:50:00.312208 [NOTICE] fpm_children_make(), line 352: child 7970 (pool default) started
Jan 28 22:50:00.313161 [NOTICE] fpm_children_make(), line 352: child 7971 (pool default) started
Jan 28 22:50:00.314210 [NOTICE] fpm_children_make(), line 352: child 7972 (pool default) started
Jan 28 22:50:00.314242 [NOTICE] fpm_event_loop(), line 107: libevent: entering main loop
Jan 29 16:58:30.845059 [NOTICE] fpm_got_signal(), line 70: received SIGUSR2
Jan 29 16:58:30.856418 [NOTICE] fpm_pctl(), line 256: switching to 'reloading' state
Jan 29 16:58:30.856449 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 3 SIGQUIT to child 7972 (pool default)
Jan 29 16:58:30.856463 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 3 SIGQUIT to child 7971 (pool default)
Jan 29 16:58:30.856475 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 3 SIGQUIT to child 7970 (pool default)
Jan 29 16:58:30.856487 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 3 SIGQUIT to child 7969 (pool default)
Jan 29 16:58:30.856537 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 3 SIGQUIT to child 7968 (pool default)
Jan 29 16:58:30.856546 [NOTICE] fpm_pctl_kill_all(), line 181: 5 children are still alive
Jan 29 16:58:35.845629 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 15 SIGTERM to child 7972 (pool default)
Jan 29 16:58:35.845662 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 15 SIGTERM to child 7971 (pool default)
Jan 29 16:58:35.845671 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 15 SIGTERM to child 7970 (pool default)
Jan 29 16:58:35.845678 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 15 SIGTERM to child 7969 (pool default)
Jan 29 16:58:35.845686 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 15 SIGTERM to child 7968 (pool default)
Jan 29 16:58:35.845691 [NOTICE] fpm_pctl_kill_all(), line 181: 5 children are still alive
Jan 29 16:58:36.450929 [NOTICE] fpm_got_signal(), line 48: received SIGCHLD
Jan 29 16:58:36.451016 [WARNING] fpm_children_bury(), line 215: child 7968 (pool default) exited on signal 15 SIGTERM after 65319.297391 seconds from start
Jan 29 16:58:36.451062 [WARNING] fpm_children_bury(), line 215: child 7971 (pool default) exited on signal 15 SIGTERM after 65319.294705 seconds from start
Jan 29 16:58:36.451088 [WARNING] fpm_children_bury(), line 215: child 7972 (pool default) exited on signal 15 SIGTERM after 65319.293684 seconds from start
Jan 29 16:58:36.451103 [NOTICE] fpm_got_signal(), line 48: received SIGCHLD
Jan 29 16:58:36.451834 [NOTICE] fpm_got_signal(), line 48: received SIGCHLD
Jan 29 16:58:36.451868 [WARNING] fpm_children_bury(), line 215: child 7969 (pool default) exited on signal 15 SIGTERM after 65319.297343 seconds from start
Jan 29 16:58:36.451891 [WARNING] fpm_children_bury(), line 215: child 7970 (pool default) exited on signal 15 SIGTERM after 65319.296487 seconds from start
Jan 29 16:58:36.451903 [NOTICE] fpm_pctl_exec(), line 95: reloading: execvp("/usr/local/php/bin/php-cgi", {"/usr/local/php/bin/php-cgi", "--fpm", "--fpm-config", "/usr/local/php/etc/php-fpm.conf"})
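The log is full of NOTICE entries, and it looked the same over the following days. According to what I found online, this kind of failure is mostly caused by the limit on how many file handles a PHP process may open (the first log line does report getrlimit(nofile): max:1024, cur:1024). Below I have collected and summarized the fixes other people suggested, in the hope that they resolve this kind of 502 problem.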
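To see how many entries of each level the log actually holds, rather than judging by eye, the entries can be tallied. This is a minimal sketch, assuming the log format shown above, where the level in square brackets is the fourth whitespace-separated field. The function name count_levels is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count php-fpm log entries per level ([NOTICE], [WARNING], ...).
# Assumes the level is the 4th whitespace-separated field, as in the log above.
count_levels() {
    awk '$4 ~ /^\[[A-Z]+\]$/ { n[$4]++ } END { for (k in n) print k, n[k] }' "$1" | sort
}

# Usage (log path taken from the article):
# count_levels /usr/local/php/logs/php-fpm.log
```

Each output line holds a level and its count, so a log dominated by NOTICE entries with only a few WARNING lines is easy to spot.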
First check the value of ulimit -n. Over SSH, run:
# ulimit -n
It returns: 65535
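Note that ulimit -n reports the limit of the current SSH session only. A daemon that was started another way can run with a different limit, and the log above shows php-fpm starting with max:1024. On Linux the limits of a running process can be read from /proc/<pid>/limits. The sketch below assumes that file format. The helper name nofile_from is made up, and the process name php-cgi is an assumption based on the execvp line in the log.

```shell
#!/bin/sh
# Hypothetical helper: print the soft and hard "Max open files" values
# from a Linux /proc/<pid>/limits file passed as the first argument.
nofile_from() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Usage: limits of the current shell
# nofile_from "/proc/$$/limits"

# Usage: limits of running php-fpm workers (process name php-cgi is assumed)
# for pid in $(pgrep php-cgi); do nofile_from "/proc/$pid/limits"; done
```

If the workers still show 1024 while the shell shows 65535, the new limit has not reached php-fpm yet.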
1. Raise the server's limit on open file handles
Over SSH, run # vi /etc/security/limits.conf and append the following at the end:
* soft nofile 65535
* hard nofile 65535
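If the file is edited by script instead of vi, it helps to append each line only when it is missing, so that rerunning the step does not create duplicates. This is a minimal sketch. The helper name ensure_line is made up for illustration, and the commands need root to write to /etc/security/limits.conf.

```shell
#!/bin/sh
# Hypothetical helper: append line $2 to file $1 only if that exact line
# is not already present. -F treats the pattern as a fixed string, so the
# leading "*" is matched literally; -x requires a whole-line match.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as root):
# ensure_line /etc/security/limits.conf '* soft nofile 65535'
# ensure_line /etc/security/limits.conf '* hard nofile 65535'
```

limits.conf is read by pam_limits at login, so the new values apply to sessions opened after the change, not to the one already running.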
2. Raise the number of files each nginx process may open
Run # vi /usr/local/nginx/conf/nginx.conf and check that it contains:
worker_rlimit_nofile 51200;
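The value can also be read without opening an editor. This is a minimal sketch that assumes the directive sits on a line of its own, as above. The helper name get_worker_rlimit is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of worker_rlimit_nofile from an
# nginx configuration file, with the trailing semicolon stripped.
# Assumes the directive and its value are on one line.
get_worker_rlimit() {
    awk '$1 == "worker_rlimit_nofile" { sub(/;$/, "", $2); print $2 }' "$1"
}

# Usage (config path taken from the article):
# get_worker_rlimit /usr/local/nginx/conf/nginx.conf
```

After changing the file, nginx -t checks the configuration syntax before the restart. The binary's location depends on the install, so it is not spelled out here.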
3. Edit the php-fpm.conf configuration file
Having confirmed above that ulimit -n is 65535, make sure the rlimit_files option in /usr/local/php/etc/php-fpm.conf matches that value.
<value name="rlimit_files">65535</value>
<value name="max_requests">10240</value>
4. Edit sysctl.conf
# vi /etc/sysctl.conf
Append at the bottom:
fs.file-max=65535
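fs.file-max is the system-wide cap on open file handles, separate from the per-process nofile limit. On some systems the existing value may already be higher than 65535, in which case writing 65535 would lower it. It is therefore worth comparing the current value first. This is a minimal sketch for Linux. The helper name below_target is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) only if the current value $1
# is numerically below the target value $2.
below_target() {
    [ "$1" -lt "$2" ]
}

# Usage on Linux: compare the kernel's current setting with the planned one.
# current=$(cat /proc/sys/fs/file-max)
# if below_target "$current" 65535; then
#     echo "fs.file-max is $current, raising it to 65535 makes sense"
# else
#     echo "fs.file-max is already $current, no need to set 65535"
# fi
```

Once /etc/sysctl.conf has been edited, running sysctl -p as root reloads it without a reboot.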
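With that done, restart with /root/lnmp restart so the changes take effect, then watch whether similar errors show up again. (The sysctl.conf change itself is applied by sysctl -p or a reboot, not by restarting the services.)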
P.S. To keep php-fpm.log from growing so quickly, change the log level in /usr/local/php/etc/php-fpm.conf from notice to error, which slows down log output:
<value name="log_level">Error</value>
Update 2012.02.20: the 502 Bad Gateway errors appear to be gone!