Understand the implementation of Nginx's location matching

  • 2020-05-24 06:51:28
  • OfStack

As our team separates the front end from the back end, the front-end team has taken over the Nginx and Node layers. I deal with Nginx often in my daily work, and location is the part I use and modify most. Until now I only half understood its matching rules. To find out how a location is actually matched, I spent some time reading up on it and summarized what I found in this article. I hope it helps.

Syntax rules


location [ = | ~ | ~* | ^~ ] uri { ... }
location @name { ... }

The syntax is simple: the location keyword, followed by an optional modifier, then the string or pattern to match, then curly braces containing the directives to apply.

Modifiers

= means exact match. The location is hit only if the request URI path is exactly the same as the string that follows.

~ means the rule is defined with a regular expression and is case-sensitive.

~* means the rule is defined with a regular expression and is case-insensitive.

^~ means that if this prefix string is the longest matching prefix, this location is used and regular expressions are not checked.

With no modifier, the string is treated as a plain prefix.
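As a rough illustration of the modifiers side by side, the fragment below could be placed inside a server block. All paths and response bodies are assumptions made up for this sketch; the return directives only label which block answered.

```nginx
# Exact match: only /login itself, not /login/ or /login.html.
location = /login {
    return 200 "exact\n";
}

# Plain prefix match (no modifier): URIs that start with /docs/,
# unless a regex location below also matches.
location /docs/ {
    return 200 "prefix\n";
}

# Prefix match that skips the regex phase when it is the longest matching prefix.
location ^~ /static/ {
    return 200 "prefix, regex skipped\n";
}

# Case-sensitive regex: matches /a.php but not /a.PHP.
location ~ \.php$ {
    return 200 "regex, case-sensitive\n";
}

# Case-insensitive regex: matches both /a.png and /a.PNG.
location ~* \.png$ {
    return 200 "regex, case-insensitive\n";
}
```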

Matching process

Before matching, Nginx normalizes the request URI: it decodes %XX sequences, resolves the relative path components . and .., and may compress two or more adjacent slashes into one. This is the preparation step for matching.

A location can be defined in two ways: with a prefix string or with a regular expression. A regular expression is preceded by the ~ or ~* modifier.

The specific matching process is as follows:

First, Nginx checks the locations defined with prefix strings, selects the one with the longest match, and remembers it.

If a location with the = modifier matches the URI exactly, the search stops and that configuration is used.

If the longest matching prefix location has the ^~ modifier, the search also stops there and regular expressions are not checked.

Otherwise, Nginx checks the locations defined with regular expressions in the order they appear in the configuration file. As soon as one matches, the search stops and that configuration is used.

If no regular expression matches, the longest matching prefix location remembered earlier is used.

From this matching process we can draw two conclusions:

The order in which regular expression locations appear in the configuration file matters. The search stops at the first regular expression that matches, so any regular expression defined later never gets a chance to match.

Using exact matches can speed up lookups. For example, if / is requested frequently, you can define it with location = /.
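A small sketch of both points, assuming hypothetical paths under /img/ and placeholder response bodies. The fragment is meant to sit inside a server block.

```nginx
# Frequently requested root: the exact match ends the search immediately.
location = / {
    return 200 "home\n";
}

# More specific regex first. If this block came after the general one,
# /img/logo.png would be caught by the general rule and never reach it.
location ~ ^/img/.*\.png$ {
    return 200 "png under /img/\n";
}

# General rule for any URI ending in .png or .jpg.
location ~ \.(png|jpg)$ {
    return 200 "any image\n";
}
```

With this ordering, a request for /img/logo.png is answered by the first regex block, while /other/photo.jpg falls through to the second.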

Example

Let's take an example to illustrate the matching process.

Suppose we have the following configuration file:


location = / {
  [ configuration A ]
}

location / {
  [ configuration B ]
}

location /user/ {
  [ configuration C ]
}

location ^~ /images/ {
  [ configuration D ]
}

location ~* \.(gif|jpg|jpeg)$ {
  [ configuration E ]
}

A request for / exactly matches A, and the search stops there.

A request for /index.html matches B. Nginx first checks the prefix strings and finds B as the longest match, then checks the regular expressions in order. None of them matches, so the longest prefix match remembered earlier, B, is used.

A request for /user/index.html matches C. The longest prefix match is C, and since no regular expression matches, C is used.

A request for /user/1.jpg matches E. The prefix check finds C as the longest match, then the regular expression check finds E, so E is used.

A request for /images/1.jpg matches D. The prefix check finds D as the longest match. Because D uses the ^~ modifier, the regular expression check is skipped and D is used. Without that modifier, the final match would be E. It is worth pausing to think about why.

A request for /documents/about.html matches B. B matches any URI that starts with /, and in the configuration above no other location matches this request, so B is used.
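To check these results yourself, one option is to replace each placeholder with a return directive that reports which block answered. This is only a sketch: port 8080 and the response bodies are assumptions, and the server block must sit inside the http block of your configuration.

```nginx
server {
    listen 8080;               # assumed test port
    default_type text/plain;

    location = / {
        return 200 "A\n";
    }

    location / {
        return 200 "B\n";
    }

    location /user/ {
        return 200 "C\n";
    }

    location ^~ /images/ {
        return 200 "D\n";
    }

    location ~* \.(gif|jpg|jpeg)$ {
        return 200 "E\n";
    }
}
```

After reloading Nginx, requesting each of the six paths above (for example with curl against localhost:8080) should return the letter given in the walkthrough.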

The use of location @name

@ defines a named location. It is used mainly for internal redirects, not for handling ordinary requests. It is used as follows:


location / {
  try_files $uri $uri/ @custom;
}
location @custom {
  # ...do something
}

In the example above, when no file or directory corresponding to the request URI is found, the request is internally redirected to our named location (@custom in this case).

Note that a named location cannot be nested inside another location and cannot contain nested locations.
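A fuller sketch of a common use: serve static files when they exist and hand everything else to an application server. The port numbers, the root path /var/www/app, the name @node, and the Node.js backend on 127.0.0.1:3000 are all assumptions for illustration.

```nginx
server {
    listen 8080;
    root /var/www/app;         # assumed directory of static files

    location / {
        # Serve a matching file or directory if one exists;
        # otherwise redirect internally to @node.
        try_files $uri $uri/ @node;
    }

    location @node {
        # Reached only through the internal redirect above;
        # it is never matched against a request URI directly.
        proxy_set_header Host $host;
        proxy_pass http://127.0.0.1:3000;
    }
}
```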

Is the trailing / of a URL required?

Three points about the trailing / of a URL need explaining. The first relates to the location configuration; the other two do not.

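Whether the string in a location ends with / does matter for prefix matching. location /user/ matches only URIs that begin with /user/, while location /user also matches /user itself and paths such as /username. So /user/ and /user are not the same.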

If the URL has the form https://domain.com/, the presence or absence of the trailing / never causes a redirect, because the browser adds the / by default when it makes the request. Many browsers simply do not display it in the address bar. You can visit baidu to verify this.

If the URL has the form https://domain.com/some-dir/, a missing trailing / will cause a redirect. By convention, a / at the end of a URL denotes a directory, and no / denotes a file. So when you request /some-dir/, the server goes straight to that directory and looks for the default file. If you request /some-dir, the server first looks for a file named some-dir. If it cannot find one, it treats some-dir as a directory, redirects to /some-dir/, and then looks for the default file in that directory. Test your own site to see whether it behaves this way.
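On the first point, a small sketch may help. In Nginx, a prefix location is compared against the start of the request URI, so a trailing / in the location string changes which URIs it covers. The paths and response bodies below are assumptions for illustration, and the fragment is meant to sit inside a server block.

```nginx
# Matches /user and /username (and would match /user/... if the block below were absent).
location /user {
    return 200 "no slash\n";
}

# Matches /user/ and /user/profile, but not /user or /username.
# When both blocks match, this one wins because it is the longer prefix.
location /user/ {
    return 200 "with slash\n";
}
```

With both blocks present, /user and /username return "no slash", while /user/ and /user/profile return "with slash".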

Conclusion

A location can be defined in two forms: a prefix string or a regular expression. When looking for a match, Nginx first checks the prefix strings and selects the longest match, then checks the regular expressions. A matching regular expression takes precedence over a prefix string, unless the prefix location uses = or ^~.

Regular expressions are checked in the order in which they appear in the configuration file. Their order therefore matters, and it is recommended to place the more specific ones earlier.

Using an = exact match can speed up the lookup. If the root path / is visited often, it is recommended to define it with =.

