PHP URL-Parsing Functions: parse_url and parse_str

  • 2021-07-26 07:05:50
  • OfStack

PHP provides two functions that can be used to parse URLs: parse_url and parse_str.

parse_url
Parses a URL and returns its components

mixed parse_url ( string $url [, int $component = -1 ] )

This function parses a URL and returns an associative array containing whichever components are present in the URL.

This function is not meant to validate a given URL; it only breaks it down into the parts listed below. Incomplete URLs are also accepted, and parse_url() tries its best to parse them correctly.
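
As a small illustration of how incomplete URLs are handled, the sketch below parses a relative reference, a scheme-less URL, and a bare host name. The input strings and variable names are made up for this example.

```php
<?php
// Hypothetical inputs, chosen only to illustrate partial URLs.

// A relative reference: no scheme or host, only path and query.
$relative = parse_url('/path/page.php?arg=value');
print_r($relative);    // keys present: path, query

// A scheme-less URL: the host is still recognized.
$schemeless = parse_url('//www.example.com/path?arg=value');
print_r($schemeless);  // keys present: host, path, query

// A bare word is treated as a path, not as a host.
$bare = parse_url('localhost');
print_r($bare);        // keys present: path
```

The last case matters whenever a plain host name is accepted as input: the value ends up under the path key, so the caller has to move it to host itself.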

Parameters

url the URL to parse. Invalid characters will be replaced with _.

component specifies one of PHP_URL_SCHEME, PHP_URL_HOST, PHP_URL_PORT, PHP_URL_USER, PHP_URL_PASS, PHP_URL_PATH, PHP_URL_QUERY or PHP_URL_FRAGMENT to retrieve only that portion of the URL as a string (except when PHP_URL_PORT is specified, in which case an integer is returned).

Return value

For seriously malformed URLs, parse_url() may return FALSE.

If the component parameter is omitted, an associative array is returned, in which at least one element is present. The possible keys in the array are:

scheme - e.g. http
host
port
user
pass
path
query - after the question mark ?
fragment - after the hash mark #

If the component parameter is specified, parse_url() returns a string (or an integer when PHP_URL_PORT is specified) instead of an array. If the requested component does not exist in the URL, NULL is returned.

Examples


<?php
$url = 'http://username:password@hostname/path?arg=value#anchor';
print_r(parse_url($url));
echo parse_url($url, PHP_URL_PATH);
?>

The above example outputs:


Array
(
    [scheme] => http
    [host] => hostname
    [user] => username
    [pass] => password
    [path] => /path
    [query] => arg=value
    [fragment] => anchor
)
/path
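
The return types described above can be checked with a short sketch. The URLs here are hypothetical and serve only to show the integer, string, NULL, and FALSE cases side by side.

```php
<?php
// Hypothetical URL used only for illustration.
$url = 'http://example.com:8080/index.php?id=42';

// PHP_URL_PORT is the one component returned as an integer.
$port = parse_url($url, PHP_URL_PORT);
var_dump($port);      // int(8080)

// Every other component comes back as a string.
$host = parse_url($url, PHP_URL_HOST);
var_dump($host);      // string(11) "example.com"

// A component that is absent from the URL yields NULL.
$fragment = parse_url($url, PHP_URL_FRAGMENT);
var_dump($fragment);  // NULL

// A seriously malformed URL yields FALSE rather than an array.
$bad = parse_url('http:///example.com');
var_dump($bad);       // bool(false)
```

Because FALSE and NULL are both possible, comparing the result with === is safer than a loose truthiness check.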

parse_str

Parses a string into variables

void parse_str ( string $str [, array & $arr ] )

Parses str as if it were the query string passed via a URL and sets the resulting variables in the current scope (or in arr, if the second parameter is given).

To get the current QUERY_STRING, you can use the $_SERVER['QUERY_STRING'] variable.

Parameters

str the input string.

arr If the second parameter arr is given, the variables are stored in this array as array elements instead. Omitting arr is deprecated as of PHP 7.2 and is no longer allowed as of PHP 8.0.

Examples


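<?php
$str = "first=value&arr[]=foo+bar&arr[]=baz";

// Recommended form: store the parsed values in an array.
parse_str($str, $output);
echo $output['first'];  // value
echo $output['arr'][0]; // foo bar
echo $output['arr'][1]; // baz

// Legacy form (deprecated as of PHP 7.2, removed in PHP 8.0):
// without the second argument, variables are set in the current scope.
// parse_str($str);
// echo $first;  // value
// echo $arr[0]; // foo bar
// echo $arr[1]; // baz
?>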
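
A few decoding rules of parse_str are easy to overlook, so here is a minimal sketch of them. The query string and the names in it are invented for illustration.

```php
<?php
// Hypothetical query string used only for illustration.
$query = 'user[name]=Ann+Lee&user[langs][]=php&user[langs][]=go&page.size=20&q=a%26b';

parse_str($query, $params);

// Bracket syntax builds nested arrays.
echo $params['user']['name'], "\n";      // Ann Lee ("+" is decoded to a space)
echo $params['user']['langs'][1], "\n";  // go

// Dots (and spaces) in top-level names become underscores.
echo $params['page_size'], "\n";         // 20

// Percent-encoded characters are decoded in values.
echo $params['q'], "\n";                 // a&b
```

All parsed values are strings, so numeric options such as page_size need an explicit cast before being used as numbers.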

A while ago, while reading the php-resque source code, I came across a nice use of these two functions: parsing the Redis connection settings.

A Redis connection string has the form redis://user:pass@host:port/db?option1=val1&option2=val2. It looks just like a URL, so it is easy to parse with the two functions above.

Source: https://github.com/chrisboulton/php-resque/blob/master/lib/Resque/Redis.php

The code is as follows:


    /**
     * Parse a DSN string, which can have one of the following formats:
     *
     * - host:port
     * - redis://user:pass@host:port/db?option1=val1&option2=val2
     * - tcp://user:pass@host:port/db?option1=val1&option2=val2
     *
     * Note: the 'user' part of the DSN is not used.
     *
     * @param string $dsn A DSN string
     * @return array An array of DSN components, with 'false' values for any unknown components. e.g.
     *               [host, port, db, user, pass, options]
     */
    public static function parseDsn($dsn)
    {
        if ($dsn == '') {
            // Use a sensible default for an empty DSN string
            $dsn = 'redis://' . self::DEFAULT_HOST;
        }
        $parts = parse_url($dsn);
        // Check the URI scheme
        $validSchemes = array('redis', 'tcp');
        if (isset($parts['scheme']) && ! in_array($parts['scheme'], $validSchemes)) {
            throw new \InvalidArgumentException("Invalid DSN. Supported schemes are " . implode(', ', $validSchemes));
        }
        // Allow simple 'hostname' format, which `parse_url` treats as a path, not host.
        if ( ! isset($parts['host']) && isset($parts['path'])) {
            $parts['host'] = $parts['path'];
            unset($parts['path']);
        }
        // Extract the port number as an integer
        $port = isset($parts['port']) ? intval($parts['port']) : self::DEFAULT_PORT;
        // Get the database from the 'path' part of the URI
        $database = false;
        if (isset($parts['path'])) {
            // Strip non-digit chars from path
            $database = intval(preg_replace('/[^0-9]/', '', $parts['path']));
        }
        // Extract any 'user' and 'pass' values
        $user = isset($parts['user']) ? $parts['user'] : false;
        $pass = isset($parts['pass']) ? $parts['pass'] : false;
        // Convert the query string into an associative array
        $options = array();
        if (isset($parts['query'])) {
            // Parse the query string into an array
            parse_str($parts['query'], $options);
        }
        return array(
            $parts['host'],
            $port,
            $database,
            $user,
            $pass,
            $options,
        );
    }
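
To see what the two functions contribute individually, here is a simplified standalone sketch of the same idea. It is not the library's code: the DSN value is hypothetical, and ltrim() is used here instead of the preg_replace() call above to extract the database index.

```php
<?php
// Hypothetical DSN used only for illustration.
$dsn = 'redis://user:secret@localhost:6379/2?timeout=5&persistent=1';

// Step 1: split the DSN into URL components.
$parts = parse_url($dsn);

// Step 2: turn the query component into an options array.
$options = array();
if (isset($parts['query'])) {
    parse_str($parts['query'], $options);
}

// Step 3: the database index is the path without its leading slash.
$database = isset($parts['path']) ? (int) ltrim($parts['path'], '/') : false;

echo $parts['host'], "\n";      // localhost
echo $parts['port'], "\n";      // 6379
echo $database, "\n";           // 2
echo $options['timeout'], "\n"; // 5
```

parse_url handles the structural split (scheme, credentials, host, port, path, query), and parse_str then decodes the query portion, which parse_url leaves as a single raw string.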

That is my understanding of PHP's URL-parsing functions parse_url and parse_str. I have written it down here to share, and I hope it is helpful to other readers.

