(Repost) Troubleshooting Common Nginx Server Failures

This article is a repost. 2017-09-20 15:17

Purpose:
Locate and resolve errors quickly when an Nginx server fails.

Overview:
A technical guide to common Nginx errors and problems and how to resolve them.

Installation environment:
Operating system: Red Hat Enterprise Linux 6.5, 64-bit

1. Common Nginx startup errors

A freshly installed nginx sometimes fails on its first start:

sbin/nginx -c conf/nginx.conf

Error message:

sbin/nginx: error while loading shared libraries: libpcre.so.1: cannot open shared object file: No such file or directory

This means the dynamic linker cannot find libpcre.so.1 in its search path, so the environment needs a small adjustment before nginx can start.
Solution (run the command that matches your system):
32-bit system: [root@sever lib]# ln -s /usr/local/lib/libpcre.so.1 /lib
64-bit system: [root@sever lib]# ln -s /usr/local/lib/libpcre.so.1 /lib64

Then run ps -ef | grep nginx to confirm that nginx has really started. The process list should show at least two nginx processes: a master (the nginx main process) and a worker (an nginx worker process).

root 4349 1 0 02:24 ? 00:00:00 nginx: master process sbin/nginx -c conf/nginx.conf
nginx 4350 4349 0 02:24 ? 00:00:00 nginx: worker process
root 4356 28335 0 02:30 pts/1 00:00:00 grep nginx

If both processes are listed, nginx is running.
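
Before creating the symlink, it can help to confirm exactly which libraries the loader cannot resolve. The following is a minimal sketch, not part of the original article: missing_libs is a hypothetical helper name that filters the output of the standard ldd tool, and the binary path in the usage comment assumes the command is run from the nginx installation directory, as above.

```shell
# Hypothetical helper: print the names of shared libraries that ldd
# reports as "not found". Reads ldd output on standard input.
missing_libs() {
  awk '/not found/ { print $1 }'
}

# Usage (from the nginx installation directory, as in the commands above):
#   ldd sbin/nginx | missing_libs
```

An empty result means every library was resolved.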

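2. 400 Bad Request: causes and fixes

A common cause is a request header, often a large Cookie, that exceeds nginx's header buffers. Set the following in nginx.conf: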

client_header_buffer_size 16k;
large_client_header_buffers 4 64k;

Adjust the values to your situation; a moderate increase is usually enough.
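
To choose sensible values, it helps to know how large the offending header actually is. The following is a minimal sketch, not part of the original article: header_fits is a hypothetical helper name, and it assumes the header line has already been captured (for example from the browser's developer tools). It checks one header line against the size of a single buffer from large_client_header_buffers, since a single header field has to fit into one such buffer.

```shell
# Hypothetical helper: succeed if one header line fits into a buffer
# of the given size in kilobytes (for "large_client_header_buffers 4 64k"
# the size to pass is 64).
header_fits() {
  header_line=$1
  buffer_kb=$2
  # Count bytes rather than characters, since the buffers hold bytes.
  bytes=$(( $(printf '%s' "$header_line" | wc -c) ))
  [ "$bytes" -le $(( buffer_kb * 1024 )) ]
}

# Usage:
#   header_fits "Cookie: session=abc123" 64 && echo "fits"
```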

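3. Nginx 502 Bad Gateway errors

php.ini and php-fpm.conf each have a setting that limits how long a PHP script may run: max_execution_time and request_terminate_timeout, respectively. When the PHP-FPM limit is exceeded, PHP-FPM not only stops the script but also terminates the worker process that was running it. Nginx then sees its connection to that worker drop and returns a 502 error to the client.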

Taking request_terminate_timeout = 30 seconds in PHP-FPM as an example, a 502 Bad Gateway error leaves the following traces:
1) Nginx error log:
2013/09/19 01:09:00 [error] 27600#0: *78887 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.1.101, server: test.com, request: "POST /index.php HTTP/1.1", upstream: "fastcgi://unix:/dev/shm/php-fcgi.sock:", host: "test.com", referrer: "http://test.com/index.php"

2) PHP-FPM error log:
WARNING: child 25708 exited on signal 15 (SIGTERM) after 21008.883410 seconds from start

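So raising these two values keeps PHP scripts from being terminated for running too long. request_terminate_timeout can override max_execution_time, so if you would rather not change the global php.ini, changing only the PHP-FPM configuration is enough.

Also pay attention to the max_fails and fail_timeout parameters in Nginx's upstream module. Sometimes communication between Nginx and an upstream server (such as Tomcat or FastCGI) is interrupted only occasionally, but if max_fails is set low, Nginx treats that upstream server as down for the following fail_timeout period and returns 502 errors throughout.
So consider increasing max_fails and decreasing fail_timeout.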
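
To see which limits are currently in effect before changing them, the configured values can be read straight from the two PHP files. The following is a minimal sketch, not part of the original article: ini_value is a hypothetical helper name, the parsing assumes simple "key = value" lines with ";" comments, and the file paths in the usage comment are assumptions that vary by installation.

```shell
# Hypothetical helper: print the last uncommented value assigned to a key
# in an ini-style file (php.ini and php-fpm.conf both use "key = value").
ini_value() {
  awk -F= -v key="$1" '
    /^[ \t]*;/ { next }                      # skip commented-out lines
    {
      name = $1
      gsub(/[ \t]/, "", name)                # strip whitespace from the key
    }
    name == key {
      value = $2
      gsub(/^[ \t]+|[ \t]+$/, "", value)     # trim whitespace around the value
      found = value
    }
    END { if (found != "") print found }
  ' "$2"
}

# Usage (paths are assumptions; adjust to your installation):
#   ini_value max_execution_time /etc/php.ini
#   ini_value request_terminate_timeout /etc/php-fpm.conf
```

If a key prints nothing, it is unset or commented out and the built-in default applies.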

4. Nginx 413 Request Entity Too Large errors

This error usually appears when uploading files.

Edit the main Nginx configuration file nginx.conf, find the http{} block, and add:

client_max_body_size 10m; # adjust the size to your needs

If you run PHP, client_max_body_size should equal or slightly exceed the larger of the following two php.ini values, so that mismatched size limits do not cause errors.

post_max_size = 10M
upload_max_filesize = 2M
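
Because nginx and PHP both write sizes with unit suffixes, comparing them is easier after converting everything to bytes. The following is a minimal sketch, not part of the original article: to_bytes and body_limit_ok are hypothetical helper names, and only the suffixes k, m, and g (in either case) are handled.

```shell
# Hypothetical helper: convert a size such as "10m" or "2M" to bytes.
to_bytes() {
  number=${1%[kKmMgG]}          # strip a trailing unit suffix, if any
  case $1 in
    *[kK]) echo $(( number * 1024 )) ;;
    *[mM]) echo $(( number * 1024 * 1024 )) ;;
    *[gG]) echo $(( number * 1024 * 1024 * 1024 )) ;;
    *)     echo "$number" ;;
  esac
}

# Hypothetical helper: succeed if client_max_body_size ($1) is at least
# as large as the larger of post_max_size ($2) and upload_max_filesize ($3).
body_limit_ok() {
  nginx_limit=$(to_bytes "$1")
  php_limit=$(to_bytes "$2")
  upload_limit=$(to_bytes "$3")
  if [ "$upload_limit" -gt "$php_limit" ]; then
    php_limit=$upload_limit
  fi
  [ "$nginx_limit" -ge "$php_limit" ]
}

# Usage with the values shown above:
#   body_limit_ok 10m 10M 2M && echo "limits are consistent"
```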

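5. Fixing 504 Gateway Time-out (nginx)

I ran into this problem while upgrading a Discuz forum. In general, it can be caused by nginx's default buffers for FastCGI responses being too small, which leaves the FastCGI process suspended. If the FastCGI service handles that suspension poorly, the result is very likely a 504 Gateway Time-out. Pages on today's sites, especially forums with many replies and a lot of content, can reach several hundred KB. The default FastCGI response buffers are 8 buffers of 4K or 8K each, depending on the platform. We can raise this in nginx.conf by adding:

fastcgi_buffers 8 128k;

This sets the FastCGI buffers to 8 × 128k, that is, 8 buffers of 128 KB each.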

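If you are carrying out a real-time operation, you may also need to raise nginx's timeout, for example to 90 seconds:

send_timeout 90;

Adjusting just these two parameters made the timeout disappear, which worked well.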

Nginx sets its timeouts for communicating with upstream servers through fastcgi_connect_timeout, fastcgi_read_timeout, and fastcgi_send_timeout.

Taking an Nginx timeout of 90 seconds and a PHP-FPM timeout of 300 seconds as an example, the Nginx error log for a 504 Gateway Timeout looks like this:
2013/09/19 00:55:51 [error] 27600#0: *78877 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.1.101, server: test.com, request: "POST /index.php HTTP/1.1", upstream: "fastcgi://unix:/dev/shm/php-fcgi.sock:", host: "test.com", referrer: "http://test.com/index.php"

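After raising these three values (mainly the read and send ones; when they are not configured, Nginx uses 60 seconds), the 504 error was resolved as well.
These three directives can be set at the http or server level, or at the location level. If you are concerned about affecting other applications, set them in your own application's location.
Note that fastcgi_connect_timeout, fastcgi_read_timeout, and fastcgi_send_timeout apply to FastCGI, while proxy_connect_timeout, proxy_read_timeout, and proxy_send_timeout apply to proxy_pass.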

Configuration example:
location ~ \.php$ {
    root /home/cdai/test.com;
    include fastcgi_params;
    fastcgi_connect_timeout 180;
    fastcgi_read_timeout 600;
    fastcgi_send_timeout 600;
    fastcgi_pass unix:/dev/shm/php-fcgi.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /home/cdai/test.com$fastcgi_script_name;
}
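
When deciding which location needs longer timeouts, it helps to see which requests actually time out. The following is a minimal sketch, not part of the original article: timeouts_by_request is a hypothetical helper name, it assumes each error-log entry sits on a single line in the format of the log excerpt above, and the log path in the usage comment is an assumption.

```shell
# Hypothetical helper: count "upstream timed out" entries per request line
# in an nginx error log, most frequent first.
timeouts_by_request() {
  grep 'upstream timed out' "$1" \
    | sed -n 's/.*request: "\([^"]*\)".*/\1/p' \
    | awk '{ count[$0]++ } END { for (r in count) print count[r], r }' \
    | sort -rn
}

# Usage (log path is an assumption; adjust to your installation):
#   timeouts_by_request /usr/local/nginx/logs/error.log
```

Each output line holds a count followed by the request line, so the locations that would benefit most from longer timeouts appear first.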

6. How to use Nginx as a proxy

A friend runs Tomcat on one server on port 8080 (192.168.1.2:8080) and has another machine at 192.168.1.8. He wants requests to http://192.168.1.8 to reach the Tomcat service.

Configure nginx.conf on 192.168.1.8 as follows:

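server {
    listen 80;
    server_name java.linuxtone.org;
    location / {
        proxy_pass http://192.168.1.2:8080;
        include /usr/local/nginx/conf/proxy.conf;
    }
}

7. Nginx cannot be reached from other hosts after installation

A common problem right after installing nginx is that the site cannot be reached from outside the server, although wget and telnet work fine on the server itself. From any other machine, whether on the LAN or on the Internet, the site is unreachable. With telnet, the message is: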

Connecting To 192.168.0.xxx...Could not open connection to the host, on port 80: Connect failed

With wget, the message is:

Connecting to 192.168.0.100:80... failed: No route to host.

If you see these symptoms, the CentOS firewall is most likely blocking port 80. Try running the following command to open port 80:

iptables -I INPUT -p tcp --dport 80 -j ACCEPT

Then run:

/etc/init.d/iptables status

to view the current firewall rules. If you find a line like this:

ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80

the firewall rule has been added successfully, and access from outside the server now works.
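
The same check can be scripted. The following is a minimal sketch, not part of the original article: port_accepted is a hypothetical helper name that scans a rule listing in the format shown above for an ACCEPT rule on a given destination port. It does not evaluate rule order, so an earlier REJECT or DROP rule could still block the traffic.

```shell
# Hypothetical helper: succeed if the rule listing on standard input
# contains an ACCEPT rule whose destination port matches $1.
port_accepted() {
  awk -v want="dpt:$1" '
    $1 == "ACCEPT" {
      for (i = 2; i <= NF; i++) if ($i == want) found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}

# Usage (run as root):
#   iptables -L INPUT -n | port_accepted 80 && echo "port 80 is open"
```

Rules added with the iptables command do not survive a reboot. On Red Hat 6 style systems, running service iptables save writes the current rules to /etc/sysconfig/iptables so that they are restored at boot.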

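8. How to turn off Nginx logging

access_log /dev/null;
error_log /dev/null;

For the access log, access_log off; has the same effect without writing to /dev/null. Keep in mind that the error log mainly records errors that occur when clients access nginx, and it lets you quickly pinpoint client access failures.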

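Error messages and their explanations:

"upstream prematurely closed connection"
    Raised while requesting a URI when the connection to upstream is closed before a complete response has been returned, for example because the user disconnected first. In that case it has no impact on the system and can be ignored.

"recv() failed (104: Connection reset by peer)"
    (1) The server's concurrent connections exceeded its capacity, so the server dropped some of them;
    (2) the client closed the browser while the server was still sending data;
    (3) the user pressed Stop in the browser.

"(111: Connection refused) while connecting to upstream"
    Reported when connecting if the backend upstream is down or unreachable.

"(111: Connection refused) while reading response header from upstream"
    Reported when reading data after a successful connection if the backend upstream is down or unreachable.

"(111: Connection refused) while sending request to upstream"
    Reported when Nginx sends data after connecting to upstream successfully if the backend upstream is down or unreachable.

"(110: Connection timed out) while connecting to upstream"
    Nginx timed out while connecting to the upstream.

"(110: Connection timed out) while reading upstream"
    Nginx timed out while reading the response from upstream.

"(110: Connection timed out) while reading response header from upstream"
    Nginx timed out while reading the response header from upstream.

"(104: Connection reset by peer) while connecting to upstream"
    The upstream sent an RST and reset the connection.

"upstream sent invalid header while reading response header from upstream"
    The response header sent by upstream is invalid.

"upstream sent no valid HTTP/1.0 header while reading response header from upstream"
    The response header sent by upstream is invalid.

"client intended to send too large body"
    client_max_body_size sets the maximum accepted size of a client request body (default 1M); the body sent by the client exceeded the configured value.

"reopening logs"
    A user sent kill -USR1.

"gracefully shutting down"
    A user sent kill -WINCH.

"no servers are inside upstream"
    No server is configured in the upstream block.

"no live upstreams while connecting to upstream"
    All servers in the upstream are down.

"SSL_do_handshake() failed"
    The SSL handshake failed.

"ngx_slab_alloc() failed: no memory in SSL session shared cache"
    Caused by, among other things, ssl_session_cache being too small.

"could not add new SSL session to the session cache while SSL handshaking"
    Caused by, among other things, ssl_session_cache being too small.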
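
To see which of the errors above dominate a given log, the entries can be grouped by their system error code. The following is a minimal sketch, not part of the original article: errors_by_code is a hypothetical helper name, it only considers entries that carry a parenthesized code such as (110: Connection timed out), and the log path in the usage comment is an assumption.

```shell
# Hypothetical helper: count nginx error-log entries per system error
# code and description, most frequent first.
errors_by_code() {
  sed -n 's/.*(\([0-9][0-9]*: [^)]*\)).*/\1/p' "$1" \
    | awk '{ count[$0]++ } END { for (e in count) print count[e], e }' \
    | sort -rn
}

# Usage (log path is an assumption; adjust to your installation):
#   errors_by_code /usr/local/nginx/logs/error.log
```

Messages without an error code, such as "no live upstreams", are skipped by this sketch and need a separate search.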
