Optimizing nginx for high-concurrency scenarios

  • 2020-05-15 03:38:41
  • OfStack

In day-to-day operations work, nginx services are used all the time, and performance bottlenecks caused by high concurrency in nginx come up just as often. Today I will briefly review nginx performance-tuning configuration (based only on my own hands-on experience; please point out anything that is wrong).

1. The optimization here mainly means tuning the nginx configuration. Generally speaking, the following items in an nginx configuration file are useful for optimization:

1) The number of nginx worker processes. It is recommended to set this according to the number of CPUs; generally it equals the number of CPU cores, or a multiple of it.


worker_processes 8;
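As a quick check, the core count can be read with nproc (from coreutils; a sketch, not part of the original article):

```shell
# Print a worker_processes line matching the number of CPU cores.
cores=$(nproc)
echo "worker_processes $cores;"
```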

2) Bind a CPU to each worker process. In the example above, 8 processes are bound to 8 CPUs. You can also write it differently, binding one process to several CPUs.


worker_cpu_affinity 00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000;

3) The following directive sets the maximum number of file descriptors one nginx process may open. The theoretical value is the system's maximum number of open files (ulimit -n) divided by the number of nginx processes. However, nginx does not distribute requests that evenly, so it is best to keep this value consistent with ulimit -n.


worker_rlimit_nofile 65535;
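The current per-process limit can be inspected with ulimit before picking this value (a sketch; run it in the shell that starts nginx):

```shell
# worker_rlimit_nofile should stay consistent with this limit.
nofile=$(ulimit -n)
echo "worker_rlimit_nofile $nofile;"
```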

4) Use the epoll I/O model to handle asynchronous events efficiently.


use epoll;

5) The maximum number of connections allowed per worker process; in theory the maximum number of connections per nginx server is worker_processes * worker_connections.


worker_connections 65535;
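The theoretical ceiling from the formula above can be computed directly (shell arithmetic, using the values from this article's configuration):

```shell
# Theoretical max connections = worker_processes * worker_connections.
worker_processes=8
worker_connections=65535
max_clients=$((worker_processes * worker_connections))
echo "theoretical max connections: $max_clients"
```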

6) The HTTP keep-alive timeout, 60s by default. It keeps a client-to-server connection valid for the configured time, so that subsequent requests can reuse it instead of establishing (or re-establishing) a connection. Remember not to set this parameter too high! Otherwise many idle HTTP connections will occupy nginx's connection slots and can eventually bring nginx down!


keepalive_timeout 60;

7) The buffer size for client request headers can be set according to your system's page size. Generally a request header will not exceed 1k, but since the system page size is generally larger than 1k, it is set to the page size here. The page size can be obtained with the command getconf PAGESIZE.


client_header_buffer_size 4k;
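The page size mentioned above can be queried like this (getconf is POSIX; the output is in bytes):

```shell
# Use the system page size as a guide for client_header_buffer_size.
pagesize=$(getconf PAGESIZE)
echo "system page size: $pagesize bytes"
```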

8) The following parameter configures the open-file cache, which is not enabled by default. max sets the number of cache entries, and it is recommended to keep it consistent with the number of open files. inactive sets how long a file can go unrequested before its cache entry is deleted.


open_file_cache max=102400 inactive=20s;

9) The following sets how often the cached entries are checked for validity.


open_file_cache_valid 30s;

10) The minimum number of times a file must be used within the inactive time of open_file_cache. Above this threshold, the file descriptor stays open in the cache. In the example above, if a file is not used even once within the inactive time, it will be removed.


open_file_cache_min_uses 1;

11) Hiding the operating system and web server (nginx) version numbers in the response headers is good for security.


server_tokens off;

12) Enables sendfile(). sendfile() copies data between a disk file and a TCP socket (or between any two file descriptors). Before sendfile, sending a file meant allocating a data buffer in user space, copying the data from the file into that buffer with read(), and then writing the buffer to the network with write(). sendfile() instead reads the data from disk straight into the OS cache; because the copy is done in the kernel, sendfile() is more efficient than the read()/write() combination, which also has to allocate and discard buffers (more on sendfile elsewhere).


sendfile on;

13) Tells nginx to send all header files in one packet rather than one after another. That is, packets are not sent out immediately; they are sent in one go once the packet reaches its maximum size, which helps relieve network congestion.


tcp_nopush on; 

14) Tells nginx not to buffer data but to send it out in small segments as it arrives. For applications that need to send small pieces of data promptly and cannot wait for a return value, this property should be enabled.


tcp_nodelay on;


15) The buffer size for client request headers can be set according to the system page size. Generally a request header will not exceed 1k, but since the system page size is generally larger than 1k, it is set to the page size here.


client_header_buffer_size 4k;


In some cases client_header_buffer_size needs to exceed 4k, but its value must always be set to an integral multiple of the system page size.

16) Configures the open-file cache, which is not enabled by default. max sets the number of cache entries, and it is recommended to keep it consistent with the number of open files. inactive sets how long a file can go unrequested before its cache entry is deleted.


open_file_cache max=102400 inactive=20s;

17) The minimum number of times a file must be used within the inactive time of the open_file_cache directive. Above this threshold, the file descriptor stays open in the cache. In the example above, if a file is not used even once within the inactive time, it will be removed.


open_file_cache_min_uses 1;

18) Sets how often the cached entries are checked for validity.


open_file_cache_valid 30s;

Below is a simple nginx configuration file that I use:


[root@dev-huanqiu ~]# cat /usr/local/nginx/conf/nginx.conf
user  www www;
worker_processes 8;
worker_cpu_affinity 00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000;
error_log  /www/log/nginx_error.log  crit;
pid     /usr/local/nginx/nginx.pid;
worker_rlimit_nofile 65535;
 
events
{
  use epoll;
  worker_connections 65535;
}
 
http
{
  include    mime.types;
  default_type  application/octet-stream;
 
  charset  utf-8;
 
  server_names_hash_bucket_size 128;
  client_header_buffer_size 2k;
  large_client_header_buffers 4 4k;
  client_max_body_size 8m;
 
  sendfile on;
  tcp_nopush   on;
 
  keepalive_timeout 60;
 
  fastcgi_cache_path /usr/local/nginx/fastcgi_cache levels=1:2
         keys_zone=TEST:10m
         inactive=5m;
  fastcgi_connect_timeout 300;
  fastcgi_send_timeout 300;
  fastcgi_read_timeout 300;
  fastcgi_buffer_size 16k;
  fastcgi_buffers 16 16k;
  fastcgi_busy_buffers_size 16k;
  fastcgi_temp_file_write_size 16k;
  fastcgi_cache TEST;
  fastcgi_cache_valid 200 302 1h;
  fastcgi_cache_valid 301 1d;
  fastcgi_cache_valid any 1m;
  fastcgi_cache_min_uses 1;
  fastcgi_cache_use_stale error timeout invalid_header http_500; 
  open_file_cache max=204800 inactive=20s;
  open_file_cache_min_uses 1;
  open_file_cache_valid 30s; 
 
  tcp_nodelay on;
  
  gzip on;
  gzip_min_length  1k;
  gzip_buffers   4 16k;
  gzip_http_version 1.0;
  gzip_comp_level 2;
  gzip_types    text/plain application/x-javascript text/css application/xml;
  gzip_vary on;
 
  server
  {
   listen    8080;
   server_name  huan.wangshibo.com;
   index index.php index.htm;
   root  /www/html/;
 
   location /status
   {
     stub_status on;
   }
 
   location ~ .*\.(php|php5)?$
   {
     fastcgi_pass 127.0.0.1:9000;
     fastcgi_index index.php;
     include fcgi.conf;
   }
 
   location ~ .*\.(gif|jpg|jpeg|png|bmp|swf|js|css)$
   {
    expires    30d;
   }
 
   log_format  access  '$remote_addr - $remote_user [$time_local] "$request" '
        '$status $body_bytes_sent "$http_referer" '
        '"$http_user_agent" $http_x_forwarded_for';
   access_log  /www/log/access.log  access;
    }
}
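Before reloading a tuned configuration like the one above, it is worth validating it first (a sketch; the config path comes from this article and may differ on your system):

```shell
# Validate the configuration, then reload only if the check passes.
if command -v nginx >/dev/null 2>&1; then
  nginx -t -c /usr/local/nginx/conf/nginx.conf && nginx -s reload && status=reloaded || status=failed
else
  status=skipped  # nginx is not installed on this machine
fi
echo "$status"
```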

2. Several directives concerning FastCGI

1) This directive sets a path for the FastCGI cache, the directory structure levels, the key zone name and storage size, and the inactive deletion time.


fastcgi_cache_path /usr/local/nginx/fastcgi_cache levels=1:2 keys_zone=TEST:10m inactive=5m;

2) Sets the timeout for connecting to the backend FastCGI.


fastcgi_connect_timeout 300;

3) The timeout for sending a request to FastCGI. This value refers to the timeout for sending a request after the connection has already been established.


fastcgi_send_timeout 300;

4) The timeout for receiving a FastCGI reply. This value refers to the timeout for receiving a reply after the connection has already been established.


fastcgi_read_timeout 300;

5) Sets the size of the buffer used to read the first part of the FastCGI response. It can be set to the buffer size given by the fastcgi_buffers directive. The directive above means a 16k buffer is used to read the first part of the response, i.e. the response header. In practice the response header generally fits within 1k, but if you specify a buffer size in fastcgi_buffers, a buffer of the fastcgi_buffers size will be allocated to cache it anyway.


fastcgi_buffer_size 16k;

6) Sets how many buffers of what size are used locally to buffer the FastCGI response. As shown above, if a page generated by a PHP script is 256k in size, 16 buffers of 16k are allocated to cache it; if it is larger than 256k, the part beyond 256k is cached to the path specified by fastcgi_temp. This is of course unwise for the server load, because processing data in memory is faster than on disk. This value should usually be a middle value for the page sizes generated by your site's PHP scripts: if most pages are around 256k, you could set it to 16 16k, 4 64k, or 64 4k. The latter two are obviously not as good: if a page is only 32k, "4 64k" allocates one 64k buffer and "64 4k" allocates eight 4k buffers, while "16 16k" allocates two 16k buffers, which seems more reasonable.


fastcgi_buffers 16 16k;

7) I don't know what this directive is for, except that its default value is twice fastcgi_buffers.


fastcgi_busy_buffers_size 32k;

8) The block size used when writing to fastcgi_temp_path. The default value is twice fastcgi_buffers.


fastcgi_temp_file_write_size 32k;

9) Turns on the FastCGI cache and gives it a name. Personally I find enabling the cache very useful: it effectively reduces CPU load and prevents 502 errors. But this cache can also cause many problems, because it caches dynamic pages, so use it according to your own needs.


fastcgi_cache TEST;

10) Sets the cache time for the given response codes. In the example above, 200 and 302 responses are cached for 1 hour, 301 responses for 1 day, and everything else for 1 minute.


fastcgi_cache_valid 200 302 1h;
fastcgi_cache_valid 301 1d;
fastcgi_cache_valid any 1m;
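To verify the cache-valid settings above in practice, the cache status can be exposed in a response header (a sketch; the header name X-Cache is an arbitrary choice, not from the original config):

```nginx
# Inside the server block: shows HIT/MISS/EXPIRED etc. per response.
add_header X-Cache $upstream_cache_status;
```

Requests can then be checked with curl -I to see whether they were served from the FastCGI cache.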

11) The minimum number of times content must be requested within the inactive time of the fastcgi_cache_path parameter before it is cached. In the example above, if a file is not requested even once within 5 minutes, it will be removed.


fastcgi_cache_min_uses 1;

12) I don't know what this parameter does either; my guess is that it tells nginx under which conditions a stale cached response may still be used.


fastcgi_cache_use_stale error timeout invalid_header http_500;

The above are the FastCGI-related parameters in nginx.

In addition, FastCGI itself has some configuration that needs optimizing. If you use php-fpm to manage FastCGI, you can modify the following values in its configuration file:

1) The number of concurrent requests handled at the same time, i.e. php-fpm will spawn at most 60 child processes to handle concurrent connections.


pm.max_children = 60

2) The maximum number of open file descriptors.


rlimit_files = 65535

3) The maximum number of requests each child process serves before being respawned.


pm.max_requests = 10240

3. Kernel parameter optimization, in the file /etc/sysctl.conf

1) The number of TIME-WAIT buckets; the default is 180,000. (Deven: so if you want fewer TIME-WAIT sockets, you have to lower tcp_max_tw_buckets.)


net.ipv4.tcp_max_tw_buckets = 6000

2) The range of ports the system is allowed to open.


net.ipv4.ip_local_port_range = 1024 65000

3) Enables fast recycling of sockets in TIME-WAIT state, used to quickly reduce the number of TCP connections in TIME-WAIT. 1 means enabled; 0 means disabled. Note, however, that this option is generally not recommended: on a NAT (Network Address Translation) network it causes large numbers of TCP connection-setup errors, leading to website access failures.


net.ipv4.tcp_tw_recycle = 1

In fact, for net.ipv4.tcp_tw_recycle to take effect, net.ipv4.tcp_timestamps must be on as well (it is on by default).

When tcp_tw_recycle is enabled (together with tcp_timestamps, which is what makes fast socket recycling actually work), it is a disaster for clients sitting behind a NAT device!

It makes connections from clients behind the NAT device to the server unstable (some clients can connect to the server, while others cannot).
In other words, tcp_tw_recycle was designed for internal networks (network environments under your own control, with no NAT) and is not suitable for use on the public Internet.

Generally, a socket in TIME_WAIT state is recycled because no port is available and the host "cannot actively connect to the remote end"; recycling it just to reclaim memory is not necessary.

That is: the requirement applies to clients. Would a server ever have a "shortage of ports" problem?

Only if it is a front-end machine that needs to open many connections to backend services, i.e. one playing the client role.

The correct solution to this problem is:


net.ipv4.ip_local_port_range = 9000 65500 # the default range is small
net.ipv4.tcp_max_tw_buckets = 10000 # the default value is small; adjust as appropriate
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
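To judge whether the TIME-WAIT tuning above is needed at all, the current count can be read from procfs (a Linux-only sketch; state code 06 is TIME_WAIT):

```shell
# Count sockets currently in TIME-WAIT by parsing /proc/net/tcp.
tw=$(awk 'NR > 1 && $4 == "06"' /proc/net/tcp | wc -l)
echo "TIME-WAIT sockets: $tw"
```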

4) Enables reuse, allowing sockets in TIME-WAIT state to be reused for new TCP connections. Enabling this is safe; generally there is no need to change it!


net.ipv4.tcp_tw_reuse = 1

5) Turns on SYN cookies: when the SYN wait queue overflows, cookies are enabled to handle it.


net.ipv4.tcp_syncookies = 1

6) In web applications, the backlog of the listen function is capped by net.core.somaxconn, which defaults to 128, while nginx's NGX_LISTEN_BACKLOG defaults to 511, so this value needs to be raised.


net.core.somaxconn = 262144
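The value currently in effect can be read back from procfs for a quick sanity check (Linux; not from the original article):

```shell
# Read the effective listen() backlog cap.
somaxconn=$(cat /proc/sys/net/core/somaxconn)
echo "net.core.somaxconn is currently $somaxconn"
```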

7) The maximum number of packets allowed to queue when a network interface receives packets faster than the kernel can process them.
net.core.netdev_max_backlog = 262144

8) The maximum number of TCP sockets in the system not attached to any user file handle. If this number is exceeded, orphaned connections are reset immediately and a warning is printed. This limit exists only to prevent simple DoS attacks; do not rely on it too much or lower it artificially. If anything, increase it (after adding memory).


net.ipv4.tcp_max_orphans = 262144

9) The maximum number of remembered connection requests that have not yet received a client acknowledgment. The default is 1024 for systems with 128M of memory, and 128 for systems with little memory.


net.ipv4.tcp_max_syn_backlog = 262144

10) Timestamps help avoid sequence-number wraparound. A 1Gbps link is bound to re-encounter previously used sequence numbers; the timestamp lets the kernel accept such "abnormal" packets. Here it is turned off.


net.ipv4.tcp_timestamps = 0

To improve performance, many servers turn on the net.ipv4.tcp_tw_recycle option; in NAT network environments this easily causes some connect failures.

Personal Suggestions:

Turn off net.ipv4.tcp_tw_recycle rather than net.ipv4.tcp_timestamps;

because when net.ipv4.tcp_timestamps is off, enabling net.ipv4.tcp_tw_recycle has no effect, while net.ipv4.tcp_timestamps can be turned on and work on its own.

11) To open a connection to the remote end, the kernel needs to send a SYN with an ACK acknowledging the previous SYN, i.e. the second step of the three-way handshake. This setting determines how many SYN+ACK packets the kernel sends before abandoning the connection.


net.ipv4.tcp_synack_retries = 1

12) The number of SYN packets sent before the kernel gives up establishing the connection.


net.ipv4.tcp_syn_retries = 1

13) If the socket was closed at this end's request, this parameter determines how long it stays in FIN-WAIT-2 state. The peer may misbehave and never close its side of the connection, or even crash unexpectedly. The default is 60 seconds; the usual value in 2.2 kernels was 180 seconds. You can keep this setting, but bear in mind that even on a lightweight web server there is a risk of memory overflow from large numbers of dead sockets. FIN-WAIT-2 is less dangerous than FIN-WAIT-1 because it consumes at most 1.5K of memory, but such sockets live longer.


net.ipv4.tcp_fin_timeout = 30

14) How often TCP sends keepalive probes when keepalive is enabled. The default is 2 hours.


net.ipv4.tcp_keepalive_time = 30

Below is a standard set of kernel parameters I commonly use:


net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 1
net.core.somaxconn = 262144
net.core.netdev_max_backlog = 262144
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 30

--------------------------------------------------- remember a small incident ---------------------------------------------------

net.ipv4.tcp_tw_recycle = 1 does reduce the number of TIME-WAIT connections, and I normally turn it on.

But this parameter has also burned me once:

The company's CMS back office for publishing news uses an haproxy+keepalived proxy architecture, and the external IPs of the back-end real servers have all been removed.

Symptom: one morning, at the peak of publishing activity, access to the CMS back office started failing. Restarting the php service fixed it immediately, but after a while access failed again.

Checking the nginx and php logs turned up nothing. After some googling, the culprit turned out to be the parameter net.ipv4.tcp_tw_recycle.

This network architecture uses NAT mode for the back-end real servers. With this parameter turned on, large numbers of TCP connection-establishment errors occurred, leading to the website access failures.

Finally, net.ipv4.tcp_tw_recycle was set to 0; as soon as the feature was turned off, back-office access immediately returned to normal.

