How nginx reads request header data, in detail

  • 2020-05-24 06:49:43
  • OfStack

In the previous article, we showed how nginx reads and parses the request line. In this article, we focus on how nginx reads and parses the header data sent by the client. The reading process is essentially the same for the request line and the request headers, because both face the same problem: how to read and process data from a stream that arrives in discontinuous pieces.

1. Main flow of reading request headers

Before introducing how request headers are read, let's look at an example HTTP request message:


POST /web/book/read HTTP/1.1
Host: localhost
Connection: keep-alive
Content-Length: 365
Accept: application/json, text/plain, */*

The first line of the example is the request line, and the following lines are request headers. Each request header has the form name: value and occupies one line. In the previous article on reading the request line, we explained that once the request line has been read, nginx changes the callback of the current read event to ngx_http_process_request_headers() and calls it directly to try to read the request header data. This method drives the reading of the request header data; its source code is as follows:


/**
 * Parses the header data sent by the client
 */
static void ngx_http_process_request_headers(ngx_event_t *rev) {
 u_char *p;
 size_t len;
 ssize_t n;
 ngx_int_t rc, rv;
 ngx_table_elt_t *h;
 ngx_connection_t *c;
 ngx_http_header_t *hh;
 ngx_http_request_t *r;
 ngx_http_core_srv_conf_t *cscf;
 ngx_http_core_main_conf_t *cmcf;

 c = rev->data;
 r = c->data;

 if (rev->timedout) {
  ngx_log_error(NGX_LOG_INFO, c->log, NGX_ETIMEDOUT, "client timed out");
  c->timedout = 1;
  ngx_http_close_request(r, NGX_HTTP_REQUEST_TIME_OUT);
  return;
 }

 cmcf = ngx_http_get_module_main_conf(r, ngx_http_core_module);
 rc = NGX_AGAIN;

 for (;;) {
  if (rc == NGX_AGAIN) {
   // If the header buffer has no free space left, allocate more
   if (r->header_in->pos == r->header_in->end) {
    // Allocate a new, larger header buffer
    rv = ngx_http_alloc_large_header_buffer(r, 0);
    if (rv == NGX_ERROR) {
     ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
     return;
    }

    // The header sent by the client is too long: it exceeds the maximum size set by large_client_header_buffers
    if (rv == NGX_DECLINED) {
     p = r->header_name_start;
     r->lingering_close = 1;
     if (p == NULL) {
      ngx_log_error(NGX_LOG_INFO, c->log, 0, "client sent too large request");
      ngx_http_finalize_request(r, NGX_HTTP_REQUEST_HEADER_TOO_LARGE);
      return;
     }

     len = r->header_in->end - p;
     if (len > NGX_MAX_ERROR_STR - 300) {
      len = NGX_MAX_ERROR_STR - 300;
     }

     ngx_http_finalize_request(r, NGX_HTTP_REQUEST_HEADER_TOO_LARGE);
     return;
    }
   }

   // Try to read newly arrived client data from the connection
   n = ngx_http_read_request_header(r);
   if (n == NGX_AGAIN || n == NGX_ERROR) {
    return;
   }
  }

  cscf = ngx_http_get_module_srv_conf(r, ngx_http_core_module);
  // Parse one header line from the data read so far
  rc = ngx_http_parse_header_line(r, r->header_in, cscf->underscores_in_headers);

  // NGX_OK means that one header line has been parsed successfully
  if (rc == NGX_OK) {
   r->request_length += r->header_in->pos - r->header_name_start;
   // Skip invalid headers when ignore_invalid_headers is enabled
   if (r->invalid_header && cscf->ignore_invalid_headers) {
    continue;
   }

   // Allocate a list element to hold this header
   h = ngx_list_push(&r->headers_in.headers);
   if (h == NULL) {
    ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
    return;
   }

   h->hash = r->header_hash;
   // The header name becomes the key
   h->key.len = r->header_name_end - r->header_name_start;
   h->key.data = r->header_name_start;
   h->key.data[h->key.len] = '\0';

   // The header value becomes the value
   h->value.len = r->header_end - r->header_start;
   h->value.data = r->header_start;
   h->value.data[h->value.len] = '\0';

   h->lowcase_key = ngx_pnalloc(r->pool, h->key.len);
   if (h->lowcase_key == NULL) {
    ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
    return;
   }

   if (h->key.len == r->lowcase_index) {
    ngx_memcpy(h->lowcase_key, r->lowcase_header, h->key.len);
   } else {
    ngx_strlow(h->lowcase_key, h->key.data, h->key.len);
   }

   // headers_in_hash holds every header that nginx recognizes; look up
   // whether the header sent by the client is one of them
   hh = ngx_hash_find(&cmcf->headers_in_hash, h->hash, h->lowcase_key, h->key.len);
   // handler is the processing method defined for each header in ngx_http_headers_in;
   // each handler() stores the client's header in the matching field of r->headers_in
   if (hh && hh->handler(r, h, hh->offset) != NGX_OK) {
    return;
   }

   continue;
  }

  // NGX_HTTP_PARSE_HEADER_DONE means that all headers have been parsed
  if (rc == NGX_HTTP_PARSE_HEADER_DONE) {
   r->request_length += r->header_in->pos - r->header_name_start;
   r->http_state = NGX_HTTP_PROCESS_REQUEST_STATE;
   // Check the validity of the header data sent by the client
   rc = ngx_http_process_request_header(r);
   if (rc != NGX_OK) {
    return;
   }

   ngx_http_process_request(r);
   return;
  }

  // NGX_AGAIN means the header line read so far is incomplete and more data must be read
  if (rc == NGX_AGAIN) {
   continue;
  }
  
  ngx_log_error(NGX_LOG_INFO, c->log, 0, "client sent invalid header line");
  ngx_http_finalize_request(r, NGX_HTTP_BAD_REQUEST);
  return;
 }
}
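
The lookup-and-dispatch step at the end of the NGX_OK branch (ngx_hash_find() followed by hh->handler(r, h, hh->offset)) can be illustrated with a small standalone sketch. This is not nginx code: every hd_ name, the three-field structure and the linear table scan are assumptions made for illustration, whereas nginx looks the name up in the prebuilt headers_in_hash. The sketch only shows how a per-header handler combined with a field offset lets one generic function fill different fields.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for r->headers_in. */
typedef struct {
    const char *host;
    const char *connection;
    long long   content_length;          /* -1 until a valid value is stored */
} hd_headers_in_t;

typedef int (*hd_handler_pt)(hd_headers_in_t *in, const char *value, size_t offset);

typedef struct {
    const char    *name;                 /* lowercase header name */
    size_t         offset;               /* field inside hd_headers_in_t */
    hd_handler_pt  handler;
} hd_header_t;

/* Generic handler: store the value pointer in the field found at `offset`. */
static int hd_store_line(hd_headers_in_t *in, const char *value, size_t offset) {
    const char **field = (const char **) ((char *) in + offset);
    *field = value;
    return 0;
}

/* Specialized handler: accept 1 to 18 decimal digits, reject anything else. */
static int hd_store_length(hd_headers_in_t *in, const char *value, size_t offset) {
    long long n = 0;
    size_t    digits = 0;

    (void) offset;                       /* always targets content_length */
    for (; *value != '\0'; value++) {
        if (!isdigit((unsigned char) *value) || ++digits > 18) {
            return -1;
        }
        n = n * 10 + (*value - '0');
    }
    if (digits == 0) {
        return -1;
    }
    in->content_length = n;
    return 0;
}

static const hd_header_t hd_table[] = {
    { "host",           offsetof(hd_headers_in_t, host),       hd_store_line   },
    { "connection",     offsetof(hd_headers_in_t, connection), hd_store_line   },
    { "content-length", 0,                                     hd_store_length },
};

/* Lowercase the name, find its entry (linear scan here, a hash in nginx),
 * and run the handler. Unknown headers return 0 and are left untouched;
 * -1 means the handler rejected the value. */
static int hd_dispatch(hd_headers_in_t *in, const char *name, const char *value) {
    char   low[32];
    size_t i, len = strlen(name);

    assert(in != NULL && value != NULL);
    if (len >= sizeof(low)) {
        return 0;                        /* too long to match any table entry */
    }
    for (i = 0; i < len; i++) {
        low[i] = (char) tolower((unsigned char) name[i]);
    }
    low[len] = '\0';

    for (i = 0; i < sizeof(hd_table) / sizeof(hd_table[0]); i++) {
        if (strcmp(low, hd_table[i].name) == 0) {
            return hd_table[i].handler(in, value, hd_table[i].offset);
        }
    }
    return 0;
}
```

Calling hd_dispatch(&in, "Host", "localhost") stores the value pointer in in.host, while a header that is not in the table is simply skipped by the dispatcher, much as the source above only calls a handler when hh is non-NULL.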

Here, the request header reading is mainly divided into the following steps:

  1. Check whether the current read event has timed out. If it has, close the request directly.
  2. Check whether r->header_in->pos == r->header_in->end holds, that is, whether the read buffer still has free space for newly read data. If it does not, request a new block of memory from the memory pool.
  3. Call ngx_http_read_request_header() to read data from the handle of the current connection. A return value greater than 0 is the length of the data available. NGX_ERROR means that the read failed or that the client closed the connection (recv returned 0), and the request has already been finalized. NGX_AGAIN means that no data was read and nginx must wait for more. Note that when the return value is NGX_AGAIN or NGX_ERROR, the method returns directly without any further processing. For NGX_AGAIN this is safe because the callback of the current read event is ngx_http_process_request_headers(): when a new read event fires, it will call ngx_http_read_request_header() again to read the data. In addition, before ngx_http_read_request_header() returns NGX_AGAIN, it adds the current read event to the event queue again and registers the read event for the connection on the epoll handle.
  4. Call ngx_http_parse_header_line() to parse the request header data that has been read. Each call parses only one request header, but the infinite for loop, together with the repeated triggering of read events, ensures that all request header data is eventually read.
  5. Act on the return value of ngx_http_parse_header_line(). If it is NGX_OK, store the newly parsed header in the r->headers_in.headers list. If it is NGX_HTTP_PARSE_HEADER_DONE, all headers have been read successfully. At this point, ngx_http_process_request_header() is called first to check the validity of the headers read, and then ngx_http_process_request() is called to start the 11 phases of HTTP request processing in nginx, whose implementation will be explained in a later article.
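
The read-then-parse loop described above can be sketched as a standalone program fragment. It is a simplification under stated assumptions: all sk_ names are hypothetical, the "client" is a string handed over in fixed-size pieces, and the parser simply rescans the current line for a newline, whereas ngx_http_parse_header_line() keeps its parsing state between calls. The point is only to show why the loop reads more data when the parser reports that a line is incomplete, and parses again without reading when a header has just been completed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes; they are not nginx's numeric values. */
enum { SK_OK = 0, SK_DONE = 1, SK_AGAIN = -2 };

/* Simplified buffer: [pos, last) is unparsed data, [last, end) is free space. */
typedef struct {
    char  data[256];
    char *pos, *last, *end;
} sk_buf_t;

/* Simulated client that delivers `msg` in pieces of at most `chunk` bytes. */
typedef struct {
    const char *msg;
    size_t      off, chunk;
} sk_source_t;

/* Copy the next piece into the free space; returns the bytes copied
 * (0 when the source is exhausted or the buffer is full). */
static size_t sk_read(sk_source_t *s, sk_buf_t *b) {
    size_t left = strlen(s->msg) - s->off;
    size_t room = (size_t) (b->end - b->last);
    size_t n = s->chunk < left ? s->chunk : left;

    if (n > room) {
        n = room;
    }
    memcpy(b->last, s->msg + s->off, n);
    s->off += n;
    b->last += n;
    return n;
}

/* Consume at most one line starting at b->pos.
 * SK_AGAIN: no complete line yet; SK_DONE: empty line (end of headers);
 * SK_OK: one header line consumed. */
static int sk_parse_line(sk_buf_t *b) {
    char  *nl = (char *) memchr(b->pos, '\n', (size_t) (b->last - b->pos));
    size_t len;

    if (nl == NULL) {
        return SK_AGAIN;
    }
    len = (size_t) (nl - b->pos);
    if (len > 0 && b->pos[len - 1] == '\r') {
        len--;                           /* ignore the CR of a CRLF ending */
    }
    b->pos = nl + 1;
    return len == 0 ? SK_DONE : SK_OK;
}

/* Main loop: returns the number of header lines seen before the empty line,
 * or -1 if the data ran out (or the buffer filled up) first. */
static int sk_count_headers(const char *msg, size_t chunk) {
    sk_buf_t    b;
    sk_source_t s = { msg, 0, chunk };
    int         rc = SK_AGAIN, count = 0;

    assert(msg != NULL && chunk > 0);
    b.pos = b.last = b.data;
    b.end = b.data + sizeof(b.data);

    for (;;) {
        /* Read only when the parser asked for more data. */
        if (rc == SK_AGAIN && sk_read(&s, &b) == 0) {
            return -1;
        }
        rc = sk_parse_line(&b);
        if (rc == SK_OK) {
            count++;
            continue;
        }
        if (rc == SK_DONE) {
            return count;
        }
    }
}
```

Whatever chunk size is used, the same message yields the same header count, which is the property the real loop relies on when data arrives across several read events.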

2. Reading request header data

As you can see, reading request headers relies on two main methods: ngx_http_read_request_header() and ngx_http_parse_header_line(). The second is quite long, but its logic is simple: it checks whether the data read so far contains a complete request header (of the form name: value, occupying one line). If so, it returns NGX_OK; otherwise, it returns NGX_AGAIN so that more data is read. We do not walk through that method here; readers can consult the source code. Instead, we focus on how ngx_http_read_request_header() reads the request header data sent by the client:


static ssize_t ngx_http_read_request_header(ngx_http_request_t *r) {
 ssize_t n;
 ngx_event_t *rev;
 ngx_connection_t *c;
 ngx_http_core_srv_conf_t *cscf;

 c = r->connection;
 rev = c->read;

 //  Calculate how much data is currently pending 
 n = r->header_in->last - r->header_in->pos;

 // If n is greater than 0, there is still unprocessed data, so return n directly
 if (n > 0) {
  return n;
 }

 // Reaching this point means that all data read so far has been processed. If the ready flag
 // of the current read event is 1, there is unread data on the handle of the current connection,
 // so call c->recv() to read it; otherwise add the current event to the event queue again
 // and keep listening for read events on the handle of the current connection
 if (rev->ready) {
  //  Read the data on the connection file descriptor 
  n = c->recv(c, r->header_in->last, r->header_in->end - r->header_in->last);
 } else {
  n = NGX_AGAIN;
 }

 // If n is NGX_AGAIN, register the current event again and keep listening for the connection's read event on the epoll handle
 if (n == NGX_AGAIN) {
  if (!rev->timer_set) {
   cscf = ngx_http_get_module_srv_conf(r, ngx_http_core_module);
   ngx_add_timer(rev, cscf->client_header_timeout);
  }

  if (ngx_handle_read_event(rev, 0) != NGX_OK) {
   ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
   return NGX_ERROR;
  }

  return NGX_AGAIN;
 }

 // If n is 0, the client has closed the connection
 if (n == 0) {
  ngx_log_error(NGX_LOG_INFO, c->log, 0, "client prematurely closed connection");
 }

 // If the client closed the connection or the read failed, release the current request structure
 if (n == 0 || n == NGX_ERROR) {
  c->error = 1;
  c->log->action = "reading client request headers";
  ngx_http_finalize_request(r, NGX_HTTP_BAD_REQUEST);
  return NGX_ERROR;
 }

 // Advance the pointer that marks the end of the data read so far
 r->header_in->last += n;
 return n;
}



Here, the reading of request header data is mainly divided into the following steps:

  1. Check whether the current buffer still holds unprocessed data; if so, return its length directly. Such data can exist because part or all of the request header data may already have been read while the request line was being read.
  2. Check whether the current read event is ready; if so, call c->recv() to read the data on the handle of the current connection.
  3. If the current read event is not ready, add the read event to the event queue again and register the read event for the connection on the epoll handle.
  4. Examine the return value from step 2. If it is 0, the client has closed the connection; if it is NGX_ERROR, the read failed. In both cases the request is finalized with NGX_HTTP_BAD_REQUEST (status code 400). If the return value is NGX_AGAIN, continue listening for read events as in step 3. If the return value is greater than 0, the read succeeded and the value is the length of the data read.
  5. Update the buffer pointer so that it reflects the newly read data.
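
The decision logic in these steps can be condensed into a small sketch. Everything prefixed rd_ is hypothetical: the connection is simulated by a string, `ready` stands in for rev->ready, and the timer and epoll registration performed by the real method are reduced to a comment. It is meant only to show the order of the checks and how the return values are mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RD_AGAIN (-2)   /* hypothetical stand-in for NGX_AGAIN */
#define RD_ERROR (-1)   /* hypothetical stand-in for NGX_ERROR */

/* Simplified buffer: [pos, last) is unprocessed data, [last, end) is free. */
typedef struct {
    char  data[64];
    char *pos, *last, *end;
} rd_buf_t;

/* Simulated connection: `ready` mimics rev->ready; `pending` is the data a
 * receive call would deliver, and NULL simulates a peer that has closed. */
typedef struct {
    int         ready;
    const char *pending;
} rd_conn_t;

/* Simulated receive: bytes copied, 0 on close, RD_AGAIN when nothing is there. */
static long rd_recv(rd_conn_t *c, char *dst, size_t room) {
    size_t n;

    if (c->pending == NULL) {
        return 0;
    }
    n = strlen(c->pending);
    if (n == 0) {
        c->ready = 0;
        return RD_AGAIN;
    }
    if (n > room) {
        n = room;
    }
    memcpy(dst, c->pending, n);
    c->pending += n;
    return (long) n;
}

static long rd_read_header(rd_conn_t *c, rd_buf_t *b) {
    long n = (long) (b->last - b->pos);

    if (n > 0) {
        return n;                 /* step 1: leftover data, no new read needed */
    }

    assert(b->last < b->end);     /* the caller must have made room beforehand */
    n = c->ready ? rd_recv(c, b->last, (size_t) (b->end - b->last)) : RD_AGAIN;

    if (n == RD_AGAIN) {
        /* The real method arms the header timeout and re-registers the
         * read event here before returning. */
        return RD_AGAIN;
    }
    if (n == 0 || n == RD_ERROR) {
        return RD_ERROR;          /* close and read failure share one path */
    }

    b->last += n;                 /* step 5: account for the new data */
    return n;
}
```

Note that a second call made before the parser has advanced pos returns the same byte count without touching the connection, which is exactly the shortcut described in step 1.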

3. Summary

This article explained how nginx reads and parses request headers, focusing on the main flow of the reading code and the detailed steps of reading the data.

