Fixing Apache crashes caused by PHP preg_match_all on Windows Server 2003

  • 2020-05-09 19:50:36
  • OfStack

This site runs on Windows Server 2003 (32-bit) + Apache/2.2.9 (Win32) + PHP/5.2.17. When a regular expression such as preg_match_all("/ni(.*?)wo/", $html, $matches); is applied to a long string $html (more than 100,000 bytes, typically the source code of a collected web page), the Apache server crashes and restarts automatically.
The Apache error log records the following:

[Thu Apr 11 18:31:31 2013] [notice] Parent: child process exited with status 128 -- Restarting.
[Thu Apr 11 18:31:31 2013] [notice] Apache/2.2.9 (Win32) PHP/5.2.17 configured -- resuming normal operations
[Thu Apr 11 18:31:31 2013] [notice] Server built: Jun 13 2008 04:04:59
[Thu Apr 11 18:31:31 2013] [notice] Parent: Created child process 2964
[Thu Apr 11 18:31:31 2013] [notice] Disabled use of AcceptEx() WinSock2 API
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Child process is running
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Acquired the start mutex.
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Starting 350 worker threads.
[Thu Apr 11 18:31:31 2013] [notice] Child 2964: Listening on port 80.

The cause: when preg_match_all or preg_match analyzes a long string on Windows, Apache crashes and restarts because the thread stack that ThreadStackSize allocates by default on Windows is too small. The default on Win32 is only 256 KB, while the default on Linux is 8 MB, which is why the same program works on Linux but not on Windows.
According to the PCRE library's official guidance, with a stack of 256 KB the value of pcre.recursion_limit should not exceed 524.
The following table lists recommended safe values of pcre.recursion_limit for a variety of stack sizes. Beyond these values a stack overflow, and with it an Apache crash, is very likely:
Stack size  pcre.recursion_limit
 64 MB      134217
 32 MB      67108
 16 MB      33554
  8 MB      16777
  4 MB      8388
  2 MB      4194
  1 MB      2097
512 KB      1048
256 KB      524
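
Every row of the table equals the stack size in bytes divided by 500, rounded down, which suggests an assumed cost of roughly 500 bytes of stack per recursion level. A minimal sketch of that arithmetic follows; the helper name safe_recursion_limit and the 500-byte figure are assumptions inferred from the table, not part of PHP or PCRE.

```php
<?php
// Hypothetical helper: estimate a pcre.recursion_limit for a given stack size.
// Assumes about 500 bytes of stack per recursion level, the ratio the table implies.
function safe_recursion_limit($stackBytes, $bytesPerLevel = 500)
{
    return (int) floor($stackBytes / $bytesPerLevel);
}

echo safe_recursion_limit(256 * 1024), "\n";       // 524
echo safe_recursion_limit(8 * 1024 * 1024), "\n";  // 16777
echo safe_recursion_limit(64 * 1024 * 1024), "\n"; // 134217
```

The real stack cost per level depends on the PCRE build and the pattern, so treat the result as an upper bound to test against rather than a guarantee.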

If you do not increase the stack size, you must add the following at the beginning of every PHP page that uses regular expressions:

<?php
ini_set("pcre.recursion_limit", "524"); // PHP default is 100,000.
?>
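
If lowering the limit for the whole page is too broad, it can also be applied around a single call and restored afterwards. The sketch below relies on ini_set() returning the previous value; the wrapper name match_all_with_limit and the sample input are made up for illustration.

```php
<?php
// Hypothetical wrapper: apply a lower recursion limit for one call only.
function match_all_with_limit($pattern, $subject, $limit = 524)
{
    // ini_set() returns the old value on success, false on failure.
    $previous = ini_set('pcre.recursion_limit', (string) $limit);
    $count = preg_match_all($pattern, $subject, $matches);
    if ($previous !== false) {
        ini_set('pcre.recursion_limit', $previous); // restore the old limit
    }
    return array($count, $matches);
}

$before = ini_get('pcre.recursion_limit');
list($count, $matches) = match_all_with_limit('/ni(.*?)wo/', 'ni hao wo; ni men wo');
echo $count, "\n"; // 2
```

After the call, ini_get('pcre.recursion_limit') reports the same value as before, so other code on the page is unaffected.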

To see the specific error, use the following code:

$matchCount = preg_match_all("/table.*?<a>/isU", $html, $contents);
// 0 means nothing matched; false means the PCRE call itself failed.
if ($matchCount === 0 || $matchCount === false) {
    echo get_pcre_err();
}
function get_pcre_err(){
        $pcre_err = preg_last_error();  // PHP 5.2 and above.
        if ($pcre_err === PREG_NO_ERROR) {
            $msg = 'Successful non-match.';
        } else {
            // preg_match error!
            switch ($pcre_err) {
                case PREG_INTERNAL_ERROR:
                    $msg = 'PREG_INTERNAL_ERROR';
                    break;
                case PREG_BACKTRACK_LIMIT_ERROR:
                    $msg = 'PREG_BACKTRACK_LIMIT_ERROR';
                    break;
                case PREG_RECURSION_LIMIT_ERROR:
                    $msg = 'PREG_RECURSION_LIMIT_ERROR';
                    break;
                case PREG_BAD_UTF8_ERROR:
                    $msg = 'PREG_BAD_UTF8_ERROR';
                    break;
                case PREG_BAD_UTF8_OFFSET_ERROR:
                    $msg = 'PREG_BAD_UTF8_OFFSET_ERROR';
                    break;
                default:
                    $msg = 'Unrecognized PREG error';
                    break;
            }
        }
        return $msg;
}

The pattern modifiers isU mean:

i: case-insensitive matching.
s: PCRE_DOTALL, the dot also matches newline characters.
U: PCRE_UNGREEDY, quantifiers are non-greedy by default, equivalent to writing .*? in perl/python. During matching, .* stops at the first possible match instead of first consuming all the characters and then giving them back one by one.
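
A small sketch of what the U modifier does, using a made-up sample string. Note that U inverts greediness, so a quantifier followed by ? becomes greedy again when U is set; this also applies to the .*? in the isU pattern shown earlier.

```php
<?php
$text = 'abcabc';

preg_match('/a.*c/', $text, $greedy);    // default: .* is greedy
preg_match('/a.*c/U', $text, $ungreedy); // U makes .* lazy
preg_match('/a.*?c/U', $text, $flipped); // under U, .*? is greedy again

echo $greedy[0], "\n";   // abcabc
echo $ungreedy[0], "\n"; // abc
echo $flipped[0], "\n";  // abcabc
```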

When using regular expressions, try to avoid patterns that force recursive calls, since these can easily overflow the stack. For example:
/<table((?!<table).)*?<\/a>/isU causes an error, while /<table.*?<\/a>/i works normally.
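
On a small input both forms can be compared directly. The sketch below uses a made-up single-line HTML fragment, small enough that neither pattern strains the stack; it only shows that the simpler pattern finds the same match here, not that the two are equivalent for every document.

```php
<?php
// Made-up sample input for illustration.
$html = '<table><tr><td><a href="#">link</a></td></tr></table>';

// Repeated group with a lookahead: one group iteration per character.
$withLookahead = preg_match('/<table((?!<table).)*?<\/a>/isU', $html, $m1);
// Plain lazy dot: no repeated group.
$simple = preg_match('/<table.*?<\/a>/i', $html, $m2);

echo $m1[0], "\n";
echo $m2[0], "\n";
var_dump($m1[0] === $m2[0]); // bool(true)
```

The simpler pattern has no s modifier, so its dot stops at newlines; multi-line input would need that modifier added.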

How do you increase ThreadStackSize on Windows? In the Apache configuration file httpd.conf, enable "Include conf/extra/httpd-mpm.conf" (remove the leading comment character #), then set "ThreadStackSize 8400000" (approximately 8 MB) in the mpm_winnt_module section of httpd-mpm.conf.

<IfModule mpm_winnt_module>
    ThreadStackSize 8400000
    ThreadsPerChild      200
    MaxRequestsPerChild    10000
    Win32DisableAcceptEx
</IfModule>

Note that a 32-bit Apache process can only use about 2 GB of memory at most. The value of ThreadStackSize multiplied by ThreadsPerChild (8 MB * 200) must therefore stay below 2 GB, otherwise Apache cannot start. The error log then looks like this:
[Thu Apr 11 20:02:45 2013] [crit] (OS 8) Insufficient storage to process this command.   : Child 4832: _beginthreadex failed. Unable to create all worker threads. Created 212 of the 220 threads requested with the ThreadsPerChild configuration directive.

The log above shows that on this server, with the thread stack size set to 8 MB, the maximum number of threads that can be created is 212.
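
The arithmetic behind this limit can be checked with a short sketch. The ThreadStackSize value and the thread counts come from the configuration and log above; the helper name total_stack_bytes is made up, and the 2 GB figure is the approximate limit quoted in the text.

```php
<?php
// Hypothetical helper: total stack reserved by all worker threads, in bytes.
function total_stack_bytes($threadStackSize, $threadsPerChild)
{
    return $threadStackSize * $threadsPerChild;
}

$threadStackSize = 8400000;         // ThreadStackSize from httpd-mpm.conf
$limit = 2 * pow(1024, 3);          // about 2 GB for a 32-bit process (approximate)

foreach (array(200, 212, 220) as $threads) {
    $total = total_stack_bytes($threadStackSize, $threads);
    printf("%d threads: %.2f GB of stack, %s\n",
        $threads, $total / pow(1024, 3),
        $total < $limit ? 'below 2 GB' : 'over 2 GB');
}
```

Even 220 threads stay below 2 GB by this multiplication, yet thread creation stopped at 212. The process also needs address space for code, heap, and loaded modules, so the practical ceiling is lower than the simple product suggests.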

