Resolving Nginx 502 Bad Gateway and Nginx 504 Gateway Time-out errors

  • 2020-05-06 12:06:06
  • OfStack

Recently the server has been going down fairly often. The latest outage ended with a "502 Bad Gateway" from Nginx, which reminded me of the earlier "504 Gateway Time-out". The two are probably connected, so it is worth working out how. Nginx 504 Gateway Time-out means that the gateway request did not complete in time; put simply, the request never reached a PHP-CGI process that was free to execute it.

In general, Nginx 502 Bad Gateway is related to the settings in php-fpm.conf, while Nginx 504 Gateway Time-out is related to the settings in nginx.conf.

Nginx 504 Gateway Time-out was covered in a previous article, so let's set it aside for now and look at how to resolve 502 Bad Gateway. The most important factor is the configuration in php-fpm.conf. php-fpm.conf has two crucial parameters, "max_children" and "request_terminate_timeout", and both values need to be worked out for your server.

If your server performs well and has sufficient bandwidth, and your PHP scripts contain no infinite loops or bugs, you can set "request_terminate_timeout" to 0s. 0s means PHP-CGI keeps running with no time limit. If you cannot guarantee this, that is, your PHP-CGI may have bugs, your bandwidth is insufficient, or something else can make PHP-CGI hang, then give "request_terminate_timeout" a value based on the performance of your server. In general, the better the performance, the higher you can set it.

How should the value of "max_children" be calculated? In principle, the larger the better: with more php-cgi processes, requests are handled faster and fewer are queued. "max_children" must still be set according to the capacity of the server. If each php-cgi process consumes about 20M of memory and "max_children" is set to 80, then 20M*80=1600M, meaning all PHP-CGI processes together use up to 1600M at peak, which must be less than the available memory.
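The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the 2048M memory budget are assumptions, not values from the article, and the per-process figure should be measured on your own server.

```python
def peak_memory_mb(max_children: int, per_process_mb: int) -> int:
    """Worst-case memory used when every php-cgi child is busy."""
    return max_children * per_process_mb


def max_children_for(memory_budget_mb: int, per_process_mb: int) -> int:
    """Largest max_children whose peak memory still fits in the budget."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    return memory_budget_mb // per_process_mb


# The article's example: 80 children at about 20M each.
print(peak_memory_mb(80, 20))      # 1600

# Hypothetical budget of 2048M reserved for php-cgi (an assumption).
print(max_children_for(2048, 20))  # 102
```

Working backwards from a memory budget, rather than picking a number first, makes it harder to set "max_children" higher than the machine can hold.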

If "max_children" is set too low, such as 5-10, php-cgi will be overworked ("very tired"): processing is slow and waiting times are long. If a request is not processed for a long time, the result is a 504 Gateway Time-out, while overworked php-cgi processes that run into problems produce a 502 Bad Gateway.

The following is a more detailed introduction:

Some websites running on Nginx sometimes show "502 Bad Gateway" errors, occasionally quite frequently. Here are some troubleshooting methods for Nginx 502 errors, collected and organized by this site for your reference:

An Nginx 502 error is most often caused by a problem with the back-end server in proxy mode. These errors are generally not the fault of nginx itself, so look for the cause on the back end! But because nginx reports all of these errors under its own name, the people who advocate nginx come under suspicion. After all, "bad gateway" sounds like "bad nginx", doesn't it? I hope a future version of nginx will make the error message more informative than a bare "502 Bad Gateway".

Trigger conditions for Nginx 502

The most common cause of a 502 error is a back-end host crash. The upstream configuration has a directive for this: proxy_next_upstream. It specifies which errors, encountered while fetching data from one back-end host, make nginx move on to the next back-end host. In other words, it lists the cases that would otherwise surface as a 502. The default is error timeout: error means the host is down, the connection is broken, and so on; timeout means a read blocked until it timed out. I usually spell everything out:
 
proxy_next_upstream error timeout invalid_header http_500 http_503;  

But now I may remove http_500. http_500 specifies that when the back end returns a 500 error, nginx moves on to the next host. When a back-end JSP fails, it would normally have printed a stack trace, but now that is replaced by a 502. The company's programmers did not see it that way, however; they concluded that nginx had made the mistake.

The http_503 condition can be kept, because the back end is usually Apache or Resin: if Apache crashes the result is error, but if Resin crashes the result is only a 503, so it is worth keeping.

Solution

When you encounter a 502 problem, try the following two steps first.

1. Check whether the current number of PHP FastCGI processes is sufficient:
 
netstat -anpo | grep "php-cgi" | wc -l 

If the number of FastCGI processes actually in use is close to the configured number, there are not enough FastCGI processes and the number needs to be increased.

2. The execution time of some PHP programs exceeds the waiting time of Nginx. In that case, increase the FastCGI timeouts in the nginx configuration:
 
http { 
fastcgi_connect_timeout 300; 
fastcgi_send_timeout 300; 
fastcgi_read_timeout 300; 
...... 
} 
...... 

Setting memory_limit too low in php.ini also causes errors. After changing memory_limit in php.ini to 64M and restarting nginx, it turned out that the cause was PHP running out of memory.

If these changes do not solve the problem, consider the following approaches:

1. max_children and max_requests

One server runs nginx, php (fpm), and xcache, with average daily traffic of around 3 million page views (300W PV).

Recently, PHP pages have often opened slowly: CPU usage suddenly drops very low, the system load suddenly climbs very high, and network card traffic also suddenly drops very low. Each episode lasts only a few seconds before things recover.

Checking the log files of php-fpm reveals some clues.
 
Sep 30 08:32:23.289973 [NOTICE] fpm_unix_init_main(), line 271: getrlimit(nofile): max:51200, cur:51200
Sep 30 08:32:23.290212 [NOTICE] fpm_sockets_init_main(), line 371: using inherited socket fd=10, "127.0.0.1:9000"
Sep 30 08:32:23.290342 [NOTICE] fpm_event_init_main(), line 109: libevent: using epoll
Sep 30 08:32:23.296426 [NOTICE] fpm_init(), line 47: fpm is running, pid 30587

Preceding these lines are more than 1,000 log lines recording children being closed and children being started.

It turns out that php-fpm has a max_requests parameter that specifies how many requests each child process handles before it is shut down; the default setting is 500. Since php hands requests to the children in turn (round-robin), under heavy traffic every child reaches max_requests at about the same time, so all children are shut down at roughly the same moment.

During this time, nginx cannot hand PHP files to php-fpm for processing, so CPU usage drops very low (no PHP processing, let alone SQL execution), the load climbs very high (children being closed and started, nginx waiting for php-fpm), and network card traffic drops very low (nginx has no data to send to the client).
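The synchronized shutdown can be reproduced with a toy simulation. This is a simplified sketch, not php-fpm's actual scheduler: it assumes strict round-robin dispatch and that a child restarts immediately after its last allowed request. The function name and the choice of 4 children are illustrative assumptions.

```python
def restart_points(num_children: int, max_requests: int, total_requests: int) -> list[int]:
    """Return the request indexes at which a child hits max_requests and restarts."""
    served = [0] * num_children
    restarts = []
    for i in range(total_requests):
        child = i % num_children          # strict round-robin (simplifying assumption)
        served[child] += 1
        # max_requests == 0 is treated as "no limit", so no restart is recorded.
        if max_requests and served[child] >= max_requests:
            restarts.append(i)
            served[child] = 0             # the child is replaced by a fresh one
    return restarts


# 4 children, the default max_requests of 500 mentioned above, 2000 requests.
print(restart_points(4, 500, 2000))   # [1996, 1997, 1998, 1999]

# With no limit, no child is ever recycled.
print(restart_points(4, 0, 2000))     # []
```

In this model every restart lands inside one window of four consecutive requests, which matches the brief stall described above: for a moment there is no child left to take work.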

The solution is simply to increase the number of children and set max_requests to 0 (no limit) or to a larger value:

Open /usr/local/php/etc/php-fpm.conf and raise the following two parameters (according to the actual circumstances of the server; do not set them too high):
 
<value name="max_children">5120</value>
<value name="max_requests">600</value>

Then restart php-fpm.

2. Increase the buffer capacity

Open the nginx error log and look for "upstream sent too big header while reading response header from upstream". This points to a buffer problem in nginx: the response headers produced by our site's pages may be too large for the buffer. Following a fix written up by an overseas author, we increased the buffer sizes, and the 502 problem was solved completely. Later, the system administrator adjusted the settings and kept only two parameters: the client header buffer and the FastCGI buffer size.

3. request_terminate_timeout

If the 502 errors occur mainly during certain POST or database operations rather than on static pages, check this setting in php-fpm.conf:

request_terminate_timeout

This value plays the role of max_execution_time: it is the maximum script execution time for FastCGI.

0s

0s disables the limit, which means a script can run indefinitely. (During installation I had changed this to a number without looking at it carefully.) With the value set back to 0s, the problem was solved, and long-running executions no longer fail. To tune FastCGI, you can also try changing this value to 5s and see how it behaves.

A 502 error occurs when php-cgi processes are insufficient, php takes too long to execute, or php-cgi processes die.
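How the two timeouts interact can be sketched as a small decision function. This is a deliberately simplified model and an assumption on my part, not a description of nginx or php-fpm internals: it assumes the script sends no output until it finishes, so fastcgi_read_timeout acts as a limit on total run time, and it assumes that a script killed by request_terminate_timeout shows up as a 502. The function name is hypothetical.

```python
def expected_status(script_seconds: float,
                    request_terminate_timeout: float,
                    fastcgi_read_timeout: float) -> int:
    """Guess the HTTP status a client sees for a script of the given duration."""
    # 0 means "no limit" for request_terminate_timeout, as described above.
    php_limit = request_terminate_timeout if request_terminate_timeout > 0 else float("inf")
    if script_seconds <= min(php_limit, fastcgi_read_timeout):
        return 200   # finished before either limit
    if php_limit < fastcgi_read_timeout:
        return 502   # php-fpm terminates the script first
    return 504       # nginx stops waiting first (ties also land here in this model)


print(expected_status(10, 0, 300))    # 200
print(expected_status(40, 30, 300))   # 502
print(expected_status(400, 0, 300))   # 504
```

Read this way, raising only one of the two limits tends to move a failure from one error code to the other rather than remove it, so the two settings are best chosen together.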
