Using nginx limit_req_zone to rate-limit requests from the same IP

  • 2020-05-17 07:47:35
  • OfStack

nginx can use the limit_req_zone directive of the ngx_http_limit_req_module module to rate-limit requests and keep users from maliciously flooding the server. ngx_http_limit_req_module is built into nginx by default, so it can be configured directly.

First, add the following to the http block in nginx.conf:


limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

Note: the zone is named one (the name is arbitrary), its shared memory size is 10m, and the average request processing rate must not exceed 1 request per second.

$binary_remote_addr is the binary form of $remote_addr (the client IP). It is always 4 bytes for an IPv4 address and 16 bytes for an IPv6 address, whereas $remote_addr is stored as a string and takes 7 to 15 bytes for IPv4. Each state stored in the zone has a fixed size, though: according to the nginx documentation, 64 bytes on 32-bit platforms and 128 bytes on 64-bit platforms, so a 1m zone holds roughly 16 thousand or 8 thousand states respectively. Either way, prefer $binary_remote_addr for its smaller key.

Second, add the following to a server block inside the http block:


# The dot is escaped so that it matches a literal "." rather than any character
location ~* \.htm$ {
    limit_req zone=one burst=5 nodelay;
    proxy_pass http://backend_tomcat;
}
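
Putting the two steps together, the overall layout of nginx.conf might look like the sketch below. The upstream address 127.0.0.1:8080, the listen port and the empty events block are assumptions added only to make the fragment complete; the limit_req_zone and location parts are the ones described in the two steps.

```nginx
events {}

http {
    # Shared 10m zone keyed by client address, 1 request per second on average
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

    # Assumed backend address, for illustration only
    upstream backend_tomcat {
        server 127.0.0.1:8080;
    }

    server {
        listen 80;  # assumed port

        # Rate-limit requests whose URI ends in .htm
        location ~* \.htm$ {
            limit_req zone=one burst=5 nodelay;
            proxy_pass http://backend_tomcat;
        }
    }
}
```

After editing, nginx -t checks the syntax and nginx -s reload applies the change.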

Here I am rate-limiting requests whose URI ends in htm. Note the line limit_req zone=one burst=5 nodelay;

zone=one refers to the zone defined earlier.

The official documentation describes burst as something like a "peak", but my own experiments suggest that is not quite accurate. It is better described as the length of a buffer queue.

nodelay literally means "no delay": requests are processed immediately instead of being held back. For example, with rate=1r/s defined above, only one request per second is processed. If two requests ending in htm arrive at the same moment, both are processed immediately when nodelay is set. Without nodelay, rate=1r/s is strictly enforced: one request is processed right away and the other is processed a second later. To the user, the page appears to stall and then loads after about a second.
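
To compare the two behaviours side by side, one could define two locations that use the same zone, one with nodelay and one without. The paths /immediate/ and /queued/ are hypothetical names for this sketch, and the zone one and the upstream backend_tomcat are assumed to be defined as in the earlier snippets.

```nginx
# Inside a server block

# With nodelay: requests within the burst allowance are passed on at once
location /immediate/ {
    limit_req zone=one burst=5 nodelay;
    proxy_pass http://backend_tomcat;
}

# Without nodelay: requests within the burst allowance are queued and
# released at the configured rate (1r/s), so the client sees a delay
location /queued/ {
    limit_req zone=one burst=5;
    proxy_pass http://backend_tomcat;
}
```

Because both locations share zone one, requests to either path count against the same per-IP state.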

The two settings that actually control the rate limiting are rate=1r/s and burst=5. Let's work through a specific case.

At some moment, two requests arrive at nginx simultaneously. One is processed and the other is placed in the buffer queue. Because nodelay is set, the second request is also processed immediately, but it still occupies one slot of the burst queue. If no request arrives during the next second, that slot is released; otherwise it stays occupied. Once more than 5 burst slots would be in use, further requests are rejected outright by nginx with a 503 status code.

So if two more requests arrive in second 2, another burst slot is taken, and likewise in seconds 3, 4 and 5. With two requests every second, both are processed each time (because of nodelay), but one of them takes up a burst slot, so after 5 seconds all burst=5 slots are occupied. When two more requests arrive in second 6, one of them is rejected.
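
The 503 returned for rejected requests is only the default. As a hedged sketch, the same module's limit_req_status and limit_req_log_level directives can change the status code and the log level used for rejections. Choosing 429 and warn here is my own assumption for illustration, not something taken from the tests described above.

```nginx
location ~* \.htm$ {
    limit_req zone=one burst=5 nodelay;
    limit_req_status 429;       # reply 429 instead of the default 503 (assumed choice)
    limit_req_log_level warn;   # log rejections at warn instead of the default error
    proxy_pass http://backend_tomcat;
}
```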

This is what I inferred from actual test results. It may differ from how nginx really works internally, but I find it easier to understand. If you know better, corrections are welcome!

Using $binary_remote_addr as above assumes there is no proxy layer between the client and nginx. If a CDN sits in front of nginx, $binary_remote_addr holds the CDN's IP address, and rate-limiting on it is wrong. The user's real IP must be obtained and used as the key instead.

Briefly, it looks like this:


## Get the original user's IP address
map $http_x_forwarded_for $clientRealIp {
    "" $remote_addr;
    ~^(?P<firstAddr>[0-9\.]+),?.*$ $firstAddr;
}

## Rate-limit by the original user's IP address
limit_req_zone $clientRealIp zone=one:10m rate=1r/s;
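
An alternative to the map, if nginx was built with ngx_http_realip_module (it is not built by default and needs --with-http_realip_module), is to let that module restore the client address for trusted proxies. This is a different technique from the map shown above, and the 203.0.113.0/24 range is only a placeholder for the CDN's real address ranges.

```nginx
# In the http block
set_real_ip_from 203.0.113.0/24;    # placeholder: trusted CDN/proxy range
real_ip_header X-Forwarded-For;     # take the client address from this header
real_ip_recursive on;               # skip trusted addresses when walking the header

# $binary_remote_addr now reflects the restored client address
limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
```

Unlike the map, which accepts whatever first address the header carries, this only honours the header when the connection comes from one of the listed proxies.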

Similarly, the limit module can be used to rate-limit web crawlers.

In the http block:


limit_req_zone $anti_spider zone=anti_spider:10m rate=1r/s;

In the server block:


location / {
    limit_req zone=anti_spider burst=2 nodelay;
    if ($http_user_agent ~* "spider|Googlebot") {
        set $anti_spider $http_user_agent;
    }
}
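
The same idea can also be written with a map in the http block instead of if and set inside the location. This is a sketch of an equivalent form, not the configuration that was actually tested; it relies on the fact that requests whose key value is empty are not counted by limit_req.

```nginx
# In the http block
map $http_user_agent $anti_spider {
    default "";                                 # ordinary clients: empty key, not limited
    "~*(spider|Googlebot)" $http_user_agent;    # crawlers: key by User-Agent
}

limit_req_zone $anti_spider zone=anti_spider:10m rate=1r/s;

# In the server block
location / {
    limit_req zone=anti_spider burst=2 nodelay;
}
```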

You can test it with curl -I -A "Baiduspider" www.remotejob.cn/notice.jsp

