Detailed Explanation of the gzcompress, gzdeflate, and gzencode Functions in PHP

  • 2021-07-10 18:58:57
  • OfStack

PHP has a set of compression and decompression functions that look very similar:

Compression functions: gzcompress, gzdeflate, gzencode

Decompression functions: gzuncompress, gzinflate, gzdecode

gzdecode was added in PHP 5.4.0, so pay attention to compatibility with older versions when using it.
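For code that still has to run on versions older than 5.4.0, one possible workaround is sketched below. The helper names gzdecode_fallback and gzdecode_compat are hypothetical. The sketch assumes a gzip string with no optional header fields (which is what gzencode produces), so it only strips the fixed 10-byte header and 8-byte trailer and hands the rest to gzinflate. It does not verify the CRC-32 checksum.

```php
<?php
// Hypothetical helper: decode a gzip string without relying on gzdecode().
// Assumes the header carries no optional fields (FLG byte is 0);
// returns false for anything else.
function gzdecode_fallback($data)
{
    // A minimal gzip member is a 10-byte header plus an 8-byte trailer.
    if (strlen($data) < 18 || substr($data, 0, 3) !== "\x1f\x8b\x08") {
        return false;
    }
    if (ord($data[3]) !== 0) {
        return false; // optional header fields are not handled in this sketch
    }
    // Strip header and trailer; what is left is a raw DEFLATE stream.
    return gzinflate(substr($data, 10, -8));
}

// Hypothetical wrapper: prefer the built-in function when it exists.
function gzdecode_compat($data)
{
    return function_exists('gzdecode') ? gzdecode($data) : gzdecode_fallback($data);
}

$packed = gzencode('hello gzip');
echo gzdecode_compat($packed), "\n";
```

On PHP 5.4.0 and later the wrapper simply calls the built-in gzdecode, so the fallback path only matters on older installations.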

These functions all start with gz, which suggests gzip compression. The names alone, however, do not reveal how the functions differ, so the only option is to consult the documentation.

The gzcompress, gzdeflate, and gzencode functions differ in the format of the compressed data they produce:

gzcompress uses the ZLIB format;

gzdeflate uses the pure DEFLATE format;

gzencode uses the GZIP format;

They do have one thing in common: all of them use the DEFLATE algorithm to compress the data. (In theory the ZLIB and GZIP formats allow other compression methods, but in practice only DEFLATE is used at present.) ZLIB and GZIP simply wrap the DEFLATE stream in a header and a trailer.
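The wrapper relationship can be checked directly. The following is a minimal sketch, assuming default compression settings and a gzip header with no optional fields: a ZLIB stream begins with the byte 78 and a GZIP stream with the bytes 1f 8b, and cutting off the header and trailer by hand leaves data that gzinflate can read. The variable names are chosen for illustration only.

```php
<?php
$data = str_repeat('PHP zlib formats. ', 20);

$raw  = gzdeflate($data);   // pure DEFLATE, no wrapper
$zlib = gzcompress($data);  // ZLIB: 2-byte header + DEFLATE + 4-byte Adler-32
$gzip = gzencode($data);    // GZIP: 10-byte header + DEFLATE + 8-byte CRC-32 and length

// The wrapper formats announce themselves in their first bytes.
printf("zlib starts with %s\n", bin2hex(substr($zlib, 0, 1)));
printf("gzip starts with %s\n", bin2hex(substr($gzip, 0, 2)));

// Stripping the wrapper by hand leaves a stream that gzinflate() accepts.
$fromZlib = gzinflate(substr($zlib, 2, -4));
$fromGzip = gzinflate(substr($gzip, 10, -8));
var_dump($fromZlib === $data, $fromGzip === $data);

// The size difference between the three outputs is the wrapper overhead.
printf("sizes: raw=%d zlib=%d gzip=%d\n", strlen($raw), strlen($zlib), strlen($gzip));
```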

Incidentally, Content-Encoding: deflate in the HTTP protocol refers to the ZLIB format, not the pure DEFLATE format.
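Because of this naming mismatch, some non-conforming servers are known to send raw DEFLATE data under that header. A tolerant client could handle both cases roughly as follows. This is only a sketch: decode_http_deflate is a hypothetical helper name, and the sample bodies are generated locally rather than taken from a real response.

```php
<?php
// Hypothetical helper for a response body labelled "Content-Encoding: deflate".
// Tries the ZLIB format first, then falls back to raw DEFLATE.
function decode_http_deflate($body)
{
    $out = @gzuncompress($body);   // ZLIB-wrapped, as the specification expects
    if ($out === false) {
        $out = @gzinflate($body);  // raw DEFLATE, sent by some non-conforming servers
    }
    return $out;
}

$text = 'response body';
var_dump(decode_http_deflate(gzcompress($text)) === $text); // ZLIB input
var_dump(decode_http_deflate(gzdeflate($text)) === $text);  // raw DEFLATE input
```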

Beginning with PHP 5.4.0, the gzcompress and gzdeflate functions accept a third parameter, $encoding, which takes one of three constants:

ZLIB_ENCODING_RAW corresponds to pure DEFLATE format;

ZLIB_ENCODING_GZIP corresponds to the GZIP format;

ZLIB_ENCODING_DEFLATE corresponds to the ZLIB format (note that it is not a pure DEFLATE format);

Although not mentioned in the documentation, these three constants can also be used in the third parameter $encoding_mode of the gzencode function.
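As an illustration of the $encoding parameter (PHP 5.4.0 or later is assumed), the sketch below asks gzcompress for each of the three formats in turn and reads each result back with the decoder that matches the format:

```php
<?php
$data = 'one payload, three containers';

// Same function, different container chosen by the third argument.
$asRaw  = gzcompress($data, 6, ZLIB_ENCODING_RAW);
$asGzip = gzcompress($data, 6, ZLIB_ENCODING_GZIP);
$asZlib = gzcompress($data, 6, ZLIB_ENCODING_DEFLATE);

// Each result is read back by the decoder that matches its format.
var_dump(gzinflate($asRaw) === $data);
var_dump(gzdecode($asGzip) === $data);
var_dump(gzuncompress($asZlib) === $data);
```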

In fact, since PHP 5.4.0 these three functions are effectively one function; only the default value of the third parameter differs. If the same encoding is passed explicitly as the third parameter, all three return identical data. A simple script confirms this:


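<?php
// Same input, same level, same encoding: only the function name differs.
$url = 'http://ofstack.com';
$s1 = gzdeflate($url, 1);                   // encoding defaults to ZLIB_ENCODING_RAW
$s2 = gzencode($url, 1, ZLIB_ENCODING_RAW); // encoding passed explicitly
if ($s1 === $s2) echo 'the same';
?>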

Running it shows that $s1 and $s2 are identical. Why? The answer is in the PHP source code. Open php-5.5.4\ext\zlib\zlib.c and you will find this code:



#define PHP_ZLIB_ENCODE_FUNC(name, default_encoding) \
static PHP_FUNCTION(name) \
{ \
    char *in_buf, *out_buf; \
    int in_len; \
    size_t out_len; \
    long level = -1; \
    long encoding = default_encoding; \
    if (default_encoding) { \
        if (SUCCESS != zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|ll", &in_buf, &in_len, &level, &encoding)) { \
            return; \
        } \
    } else { \
        if (SUCCESS != zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sl|l", &in_buf, &in_len, &encoding, &level)) { \
            return; \
        } \
    } \
    if (level < -1 || level > 9) { \
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "compression level (%ld) must be within -1..9", level); \
        RETURN_FALSE; \
    } \
    switch (encoding) { \
        case PHP_ZLIB_ENCODING_RAW: \
        case PHP_ZLIB_ENCODING_GZIP: \
        case PHP_ZLIB_ENCODING_DEFLATE: \
            break; \
        default: \
            php_error_docref(NULL TSRMLS_CC, E_WARNING, "encoding mode must be either ZLIB_ENCODING_RAW, ZLIB_ENCODING_GZIP or ZLIB_ENCODING_DEFLATE"); \
            RETURN_FALSE; \
    } \
    if (SUCCESS != php_zlib_encode(in_buf, in_len, &out_buf, &out_len, encoding, level TSRMLS_CC)) { \
        RETURN_FALSE; \
    } \
    RETURN_STRINGL(out_buf, out_len, 0); \
}
/* NOTE: The naming of these userland functions was quite unlucky */
/* {{{ proto binary gzdeflate(binary data[, int level = -1[, int encoding = ZLIB_ENCODING_RAW])
   Encode data with the raw deflate encoding */
PHP_ZLIB_ENCODE_FUNC(gzdeflate, PHP_ZLIB_ENCODING_RAW);
/* }}} */

/* {{{ proto binary gzencode(binary data[, int level = -1[, int encoding = ZLIB_ENCODING_GZIP])
   Encode data with the gzip encoding */
PHP_ZLIB_ENCODE_FUNC(gzencode, PHP_ZLIB_ENCODING_GZIP);
/* }}} */

/* {{{ proto binary gzcompress(binary data[, int level = -1[, int encoding = ZLIB_ENCODING_DEFLATE])
   Encode data with the zlib encoding */
PHP_ZLIB_ENCODE_FUNC(gzcompress, PHP_ZLIB_ENCODING_DEFLATE);
/* }}} */

As you can see, gzdeflate, gzencode, and gzcompress are all defined with the same PHP_ZLIB_ENCODE_FUNC macro and differ only in the default encoding passed to it, so of course they behave the same.
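The earlier test can be widened to cover every function and every encoding. The sketch below also includes zlib_encode, which is likewise available since PHP 5.4.0 and takes the encoding before the level. That argument order matches the "sl|l" branch of the macro, though this link is an inference from the signature and not something the excerpt above states.

```php
<?php
$data = 'http://ofstack.com';
$encodings = array(
    'ZLIB_ENCODING_RAW'     => ZLIB_ENCODING_RAW,
    'ZLIB_ENCODING_GZIP'    => ZLIB_ENCODING_GZIP,
    'ZLIB_ENCODING_DEFLATE' => ZLIB_ENCODING_DEFLATE,
);

$allSame = true;
foreach ($encodings as $name => $encoding) {
    $a = gzdeflate($data, 1, $encoding);
    $b = gzencode($data, 1, $encoding);
    $c = gzcompress($data, 1, $encoding);
    $d = zlib_encode($data, $encoding, 1); // note: encoding comes before level here
    $same = ($a === $b && $b === $c && $c === $d);
    $allSame = $allSame && $same;
    echo $name, ': ', $same ? 'the same' : 'different', "\n";
}
```

Passing the encoding explicitly everywhere removes the only thing the macro varies, which is why the outputs are expected to match byte for byte.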

The comment in the code also admits that these functions were named poorly; why they were given these names is not known.

