Summary of configuration methods for high performance Nginx server optimization

  • 2020-05-10 23:28:07
  • OfStack

Generally speaking, a well-optimized Nginx server on Linux can handle 500,000 to 600,000 requests per second; mine consistently reaches 904,000 requests per second, and it remained stable under high load for more than 12 hours.

It should be noted that all the settings listed in this article were validated in my test environment; you will need to adjust them to suit your own server:

Install Nginx from the EPEL repository:


yum -y install nginx

Back up the configuration file and configure it to suit your needs:


cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.orig
  vim /etc/nginx/nginx.conf


# This number should be, at maximum, the number of CPU cores on your system.
# (since nginx doesn't benefit from more than one worker per CPU.)
worker_processes 24;
 
# Number of file descriptors used for Nginx. This is set in the OS with 'ulimit -n 200000'
# or using /etc/security/limits.conf
worker_rlimit_nofile 200000;
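# A minimal sketch of the matching /etc/security/limits.conf entries (assuming the
# worker processes run as the "nginx" user; adjust the user name to your environment):
#   nginx  soft  nofile  200000
#   nginx  hard  nofile  200000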
 
# only log critical errors
error_log /var/log/nginx/error.log crit;
 
# Determines how many clients will be served by each worker process.
# (Max clients = worker_connections * worker_processes)
# "Max clients" is also limited by the number of socket connections available on the system (~64k)
worker_connections 4000;
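# With the values used in this file that works out to roughly:
#   max clients = 24 worker_processes * 4000 worker_connections = 96,000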
 
# essential for linux, optmized to serve many clients with each thread
use epoll;
 
# Accept as many connections as possible, after nginx gets notification about a new connection.
# May flood worker_connections, if that option is set too low.
multi_accept on;
 
# Caches information about open FDs, freqently accessed files.
# Changing this setting, in my environment, brought performance up from 560k req/sec, to 904k req/sec.
# I recommend using some variant of these options, though not the specific values listed below.
open_file_cache max=200000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
 
# Buffer log writes to speed up IO, or disable them altogether
# access_log /var/log/nginx/access.log main buffer=16k;
access_log off;
 
# Sendfile copies data between one FD and other from within the kernel.
# More efficient than read() + write(), since that requires transferring data to and from user space.
sendfile on;
 
# Tcp_nopush causes nginx to attempt to send its HTTP response head in one packet,
# instead of using partial frames. This is useful for prepending headers before calling sendfile,
# or for throughput optimization.
tcp_nopush on;
 
# don't buffer data-sends (disable Nagle algorithm). Good for sending frequent small bursts of data in real time.
tcp_nodelay on;
 
# Timeout for keep-alive connections. Server will close connections after this time.
keepalive_timeout 30;
 
# Number of requests a client can make over the keep-alive connection. This is set high for testing.
keepalive_requests 100000;
 
# allow the server to close the connection after a client stops responding. Frees up socket-associated memory.
reset_timedout_connection on;
 
# send the client a "request timed out" if the body is not loaded by this time. Default 60.
client_body_timeout 10;
 
# If the client stops reading data, free up the stale client connection after this much time. Default 60.
send_timeout 2;
 
# Compression. Reduces the amount of data that needs to be transferred over the network
gzip on;
gzip_min_length 10240;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;
gzip_disable "MSIE [1-6].";
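
For orientation, these directives do not all live at the same level of nginx.conf. A rough layout sketch of where they usually belong (the worker_* directives and error_log in the main context, the event settings in the events block, the rest in the http block) is shown below; this is only a layout sketch, not a complete configuration.


worker_processes 24;
worker_rlimit_nofile 200000;
error_log /var/log/nginx/error.log crit;

events {
    worker_connections 4000;
    use epoll;
    multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 30;
    # ... remaining directives from the listing above ...
}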

Start Nginx and configure it to start automatically at boot:


service nginx start
  chkconfig nginx on
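
If you later edit the configuration, you can verify the syntax before reloading with nginx's built-in test option:


nginx -t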

Configure Tsung and launch the test. Depending on your Tsung configuration, it takes roughly 10 minutes to reach the server's peak capacity.


[root@loadnode1 ~] vim ~/.tsung/tsung.xml

   <server host="YOURWEBSERVER" port="80" type="tcp"/>
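
For reference, a minimal tsung.xml built around that server line might look roughly like the sketch below; the client settings, arrival rate and DTD path here are placeholder assumptions, not values from the original test:


<?xml version="1.0"?>
<!DOCTYPE tsung SYSTEM "/usr/share/tsung/tsung-1.0.dtd">
<tsung loglevel="notice">
  <clients>
    <client host="localhost" use_controller_vm="true" maxusers="30000"/>
  </clients>
  <servers>
    <server host="YOURWEBSERVER" port="80" type="tcp"/>
  </servers>
  <load>
    <arrivalphase phase="1" duration="10" unit="minute">
      <users arrivalrate="1000" unit="second"/>
    </arrivalphase>
  </load>
  <sessions>
    <session name="http-get" probability="100" type="ts_http">
      <request><http url="/" method="GET" version="1.1"/></request>
    </session>
  </sessions>
</tsung>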


tsung start

Once you think you have collected enough results, exit with Ctrl+C and view the test report with the treport alias we configured earlier.

WEB server tuning - part 2: TCP stack tuning

This section applies not only to Nginx but to any WEB server: optimizing the kernel's TCP configuration can improve the server's network throughput.

The following configuration worked perfectly on my 10GBASE-T server, which went from 8 Gbps of throughput with the default configuration to 9.3 Gbps.

Of course, the conclusion may not be the same on your server.

For the configuration items below, I recommend changing only one at a time and then re-testing the server several times with a network benchmarking tool such as netperf or iperf, or with my similar test script cluster-netbench.pl.
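
For example, once the tools are installed (see below), a quick iperf throughput check between the load-generation host and the web server (a sketch assuming the classic iperf2 command-line tool; substitute your own server address) looks like this:


# on the web server
iperf -s

# on the load-generation host
iperf -c YOURWEBSERVER -t 30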


yum -y install netperf iperf
vim /etc/sysctl.conf


# Increase system IP port limits to allow for more connections
net.ipv4.ip_local_port_range = 2000 65000
 
net.ipv4.tcp_window_scaling = 1
 
# number of packets to keep in backlog before the kernel starts dropping them
net.ipv4.tcp_max_syn_backlog = 3240000
 
# increase socket listen backlog
net.core.somaxconn = 3240000
net.ipv4.tcp_max_tw_buckets = 1440000
 
# Increase TCP buffer sizes
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_congestion_control = cubic
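
Since the listing above selects the cubic congestion-control algorithm, it is worth confirming that your kernel actually offers it (standard sysctl key on Linux):


sysctl net.ipv4.tcp_available_congestion_control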

After each revision of the configuration, run the following command for the change to take effect.


sysctl -p /etc/sysctl.conf
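
To confirm that an individual setting took effect, you can query it directly, for example:


sysctl net.ipv4.tcp_congestion_control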

Don't forget to run a network benchmark after each change so you can see which revisions actually help; testing methodically like this will save you a lot of time.

Common optimization configuration items

Generally speaking, the following items in the nginx configuration file are useful to optimize:
1. worker_processes 8;
It is recommended to set the number of nginx processes according to the number of CPUs, usually a multiple of it (for example, 8 for two quad-core CPUs).
2. worker_cpu_affinity 00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000;
Binds each process to a CPU. In the example above, 8 processes are bound to 8 CPUs. You can also specify fewer masks, or bind one process to several CPUs.
3. worker_rlimit_nofile 65535;
This directive sets the maximum number of file descriptors an nginx process may open. In theory the value should be the maximum number of open files (ulimit -n) divided by the number of nginx processes, but nginx does not distribute requests evenly, so it is best to keep the value consistent with ulimit -n. With the open-file limit on a Linux 2.6 kernel set to 65535, worker_rlimit_nofile should likewise be set to 65535.

This is because nginx does not balance requests across processes perfectly, so if you set it to 10240, once total concurrency reaches 30,000-40,000 a single process may exceed 10,240 descriptors and a 502 error will be returned.
To view the Linux system's file-descriptor limits:


sysctl -a | grep fs.file

fs.file-max = 789972
fs.file-nr = 510 0 789972
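
Two related quick checks before settling on the worker_processes and worker_rlimit_nofile values discussed above (standard Linux commands; the output will differ per machine):


# number of CPU cores
grep -c processor /proc/cpuinfo

# current per-process open-file limit
ulimit -n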

4. use epoll;
Use the epoll I/O event model.
(
Supplementary notes:
Like Apache, nginx supports different event models on different operating systems:
A) standard event model
select and poll belong to the standard event model; if the current system has no more efficient method, nginx will choose select or poll.
B) efficient event model
Kqueue: FreeBSD 4.1+, OpenBSD 2.9+, NetBSD 2.0 and MacOS X. Using kqueue on dual-processor MacOS X systems may cause a kernel panic.
Epoll: for use on Linux kernel version 2.6 and later.

/dev/poll: for Solaris 7 11/99+, HP/UX 11.22+ (eventport), IRIX 6.5.15+ and Tru64 UNIX 5.1A+.
Eventport: used with Solaris 10. To prevent kernel crashes, it is necessary to install a security patch.
)
5. worker_connections 65535;
The maximum number of connections allowed per worker process; in theory, the maximum number of connections for one nginx server is worker_processes * worker_connections.
6. keepalive_timeout 60;
The keep-alive timeout, in seconds.
7. client_header_buffer_size 4k;
The buffer size for the client request header. This can be set according to your system's page size: a typical request header will not exceed 1k, but since the system page size is generally larger than 1k, it is set to the page size here.
The paging size can be obtained with the command getconf PAGESIZE.


[root@web001 ~]# getconf PAGESIZE

4096

There are cases where client_header_buffer_size exceeds 4k, but client_header_buffer_size must be set to an integral multiple of the system page size.
8. open_file_cache max=65535 inactive=60s;
This enables the cache for open files, which is disabled by default. max specifies the number of entries to cache (it is recommended to match the number of open files), and inactive specifies how long a file may go unrequested before its entry is removed.
9. open_file_cache_valid 80s;
How often to check whether the cached entries are still valid.
10. open_file_cache_min_uses 1;
The minimum number of times a file must be used within the inactive period of the open_file_cache directive. If a file is used at least this often, its descriptor stays open in the cache; as in the example above, a file that is not used even once within the inactive period will be removed.

Optimization of kernel parameters:


net.ipv4.tcp_max_tw_buckets controls the maximum number of TIME_WAIT sockets the system keeps; the default is 180,000.


net.ipv4.ip_local_port_range defines the range of local ports the system is allowed to open.


net.ipv4.tcp_tw_recycle = 1

Enable fast recycling of TIME_WAIT sockets.


net.ipv4.tcp_tw_reuse = 1

Enable reuse. Allow TIME-WAIT sockets to be reused for a new TCP connection.


net.ipv4.tcp_syncookies = 1

Enable SYN cookies: when the SYN wait queue overflows, handle new connections with cookies.


net.core.somaxconn also needs to be raised: the backlog passed to the listen() call by a web application is capped by this kernel parameter, whose default is 128, while the NGX_LISTEN_BACKLOG defined by nginx defaults to 511.


net.core.netdev_max_backlog = 262144

The maximum number of packets allowed to queue when a network interface receives packets faster than the kernel can process them.


net.ipv4.tcp_max_orphans = 262144

The maximum number of TCP sockets in the system that are not attached to any user file handle. If this number is exceeded, orphaned connections are reset immediately and a warning is printed. This limit exists only to prevent simple DoS attacks; do not rely on it or artificially lower it, but rather increase it (if you add memory).


net.ipv4.tcp_max_syn_backlog = 262144

The maximum number of remembered connection requests that have not yet received an acknowledgment from the client. The default is 1024 for systems with 128 MB of memory and 128 for systems with little memory.


net.ipv4.tcp_timestamps = 0

Timestamps guard against sequence-number wraparound; a 1 Gbps link is bound to re-encounter previously used sequence numbers, and timestamps let the kernel accept such "abnormal" packets. Here it needs to be turned off.


net.ipv4.tcp_synack_retries = 1

To open a connection to the remote end, the kernel sends a SYN together with an ACK acknowledging the earlier SYN, i.e. the second packet of the three-way handshake. This setting determines how many SYN+ACK packets the kernel sends before giving up on the connection.


net.ipv4.tcp_syn_retries = 1

The number of SYN packets the kernel sends before giving up on establishing the connection.


net.ipv4.tcp_fin_timeout = 1

If the socket was closed by our end, this parameter determines how long it stays in the FIN-WAIT-2 state. The remote end may misbehave and never close its side of the connection, or even crash unexpectedly. The default is 60 seconds; the usual value on 2.2 kernels was 180 seconds, and you may keep that setting, but remember that even on a lightly loaded WEB server there is a risk of memory exhaustion from a large number of dead sockets. FIN-WAIT-2 is less dangerous than FIN-WAIT-1 because each such socket consumes at most 1.5K of memory, but such sockets live longer.


net.ipv4.tcp_keepalive_time = 30


The interval at which TCP sends keepalive probes when keepalive is enabled. The default is 2 hours.

