PHP kernel exploration: Variable storage and type usage instructions

  • 2020-12-20 03:31:15
  • OfStack

Let's start by answering the question raised in the introduction.


<?php
    $foo = 10;
    $bar = 20;

    function change() {
        global $foo;
        //echo 'Inside the function, $foo = '.$foo.'<br />';
        // $bar is not declared global, so the global $bar is not accessible in the function body; this assigns to a local $bar
        $bar = 0;
        $foo++;
    }

    change();
    echo $foo, ' ', $bar;
?>

The program prints 11 20. The global $bar cannot be accessed inside the function, so the assignment there creates a separate local variable and the global $bar stays 20. $foo, in contrast, is imported with global, so the increment applies to the global variable and its value becomes 11.
The global keyword makes a global variable available inside a function. "Global" here does not mean the entire web site; it means the current page, including every file pulled in with include or require.
The introduction mentioned three basic characteristics of a variable, one of which is its type. Every variable has a specific type, such as string, array, or object. The type systems of programming languages can be divided into strong and weak:
In a strongly typed language, once a variable is declared with a certain type, it should not be assigned values of any other type while the program runs (this is not entirely strict, since type conversion may be involved, as described in a later section). Languages such as C, C++, and Java fall into this category.
Scripting languages such as PHP, Ruby, and JavaScript are weakly typed in this sense: a single variable can hold a value of any data type.
Weakly typed variables are a large part of what makes PHP simple yet powerful. But they are a double-edged sword, and improper use brings problems of its own. As with any tool, the more powerful it is, the more likely something is to go wrong.
Within the official PHP implementation, every variable is stored in the same data structure (zval), which can represent any of PHP's data types. It holds not only the variable's value but also its type. This is the core of PHP's weak typing.
So how does the zval structure implement weak typing? Let's take a look.
Variable storage structure
PHP does not require the data type of a variable to be stated explicitly when it is declared or used.
Being weakly typed does not mean PHP has no types. PHP has eight variable types, which fall into three categories:
* Scalar types: boolean, integer, float(double), string
* Compound types: array, object
* Special types: resource, NULL
The official PHP implementation is written in C, and C is a strongly typed language. How, then, does it implement PHP's weak typing?
The value of a variable is stored in the zval structure, which is defined in Zend/zend.h as follows:


typedef struct _zval_struct zval;
...
struct _zval_struct {
    /* Variable information */
    zvalue_value value; /* value */
    zend_uint refcount__gc;
    zend_uchar type; /* active type */
    zend_uchar is_ref__gc;
};

PHP uses this structure to store all the data of a variable. Unlike compiled, statically typed languages, PHP stores the user-space type of the variable in the same structure as its value, so the type can be read back from this information at run time.
The zval structure has four fields, with the following meanings:

Property name    Meaning                                Default value
refcount__gc     Reference count                        1
is_ref__gc       Indicates whether it is a reference    0
value            Stores the value of the variable
type             The type of the variable

Since PHP 5.3, which introduced a new garbage collection mechanism, the reference count and reference flag fields are named refcount__gc and is_ref__gc. Before that, they were named refcount and is_ref.

The value of the variable is stored in another structure, zvalue_value. How values are stored is described below.
PHP user space refers to the level of the PHP language itself, while most of this book is devoted to how PHP is implemented; that implementation can be thought of as kernel space. Since PHP is implemented in C, kernel space is bounded by what the C language allows, whereas user space is bounded by the syntax and functionality PHP provides. For example, a PHP extension may provide functions or classes, which it exports to PHP user space.
Variable types
The type field of the zval structure is the key to implementing weak typing. Its value can be IS_NULL, IS_BOOL, IS_LONG, IS_DOUBLE, IS_STRING, IS_ARRAY, IS_OBJECT, or IS_RESOURCE. As the names suggest, these are simply tags that identify the type; depending on the tag, a different kind of value is stored in the value field. Defined alongside them are two more tags, IS_CONSTANT and IS_CONSTANT_ARRAY.
This is similar to a common database design technique: to avoid creating several nearly identical tables, one identifier column records which kind of data each row holds.

The value store of a variable
As mentioned earlier, the value of a variable is stored in the zvalue_value union, which is defined as follows:


typedef union _zvalue_value {
    long lval; /* long value */
    double dval; /* double value */
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht; /* hash table value */
    zend_object_value obj;
} zvalue_value;

A union is used here instead of a struct to save space, because a variable can only be of one type at any given time. A struct would waste space unnecessarily, and since all the logic in PHP revolves around variables, the memory wasted would be very large. The union costs little and saves a great deal.
Each data type stores its value in a different way. The corresponding assignment macros are as follows:

1. General types

Variable type    Macro
boolean          ZVAL_BOOL
integer          ZVAL_LONG
float            ZVAL_DOUBLE
null             ZVAL_NULL
resource         ZVAL_RESOURCE

* boolean / integer: the value is stored in (zval).value.lval, and the type is set to the corresponding IS_* constant.
  Z_TYPE_P(z) = IS_BOOL/LONG; Z_LVAL_P(z) = ((b) != 0);
* null: no value needs to be stored; (zval).type is simply marked as IS_NULL.
  Z_TYPE_P(z) = IS_NULL;
* resource: the value is stored in the same way as for other variables, but initialization and access are implemented differently.
  Z_TYPE_P(z) = IS_RESOURCE; Z_LVAL_P(z) = l;
2. String
The type of a string is tagged like that of any other data type, but its value is stored with an additional field holding the string's length.


struct {
    char *val;
    int len;
} str;

A C string is a character array terminated by \0. Here the length of the string is stored as well, much like a redundant field added when designing a database. Computing the length of a string on demand takes O(n) time, and string operations are very frequent in PHP, so storing the length avoids recomputing it over and over and saves a lot of time: it trades space for time. As a result, PHP's strlen() function returns the length of a string in constant time. String operations are so common in programming languages that most high-level languages store the length of the string.

3. Array

Arrays are the most commonly used and most powerful variable type in PHP. They can hold data of any other type and come with a wide range of built-in functions. Storing an array is more complex than storing other variables. The value of an array is stored in the zvalue_value.ht field, which is a pointer to a HashTable. PHP arrays use hash tables to store associative data; a hash table is an efficient key-value storage structure. PHP's hash table implementation uses two data structures, HashTable and Bucket. Hash tables underpin much of PHP's internal work. The later section on HashTable introduces the basic concepts of hash tables and PHP's implementation of them.

4. Object

In object-oriented languages we can define our own data types, including class properties, methods, and other data. An object is a concrete instance of a class, with its own state and the operations it can perform.
A PHP object is a compound value, stored in a zend_object_value structure, which is defined as follows:


typedef struct _zend_object_value {
    zend_object_handle handle; // unsigned int; index into EG(objects_store).object_buckets
    zend_object_handlers *handlers;
} zend_object_value;

PHP objects are created only at run time. The previous section described the EG macro, which accesses a global structure holding run-time data. That data includes the object pool, EG(objects_store), where all created objects are kept. The handle field of an object's value is the index of the object in that pool, and the handlers field points to the handler functions used when the object is operated on. This structure and the related class structure, _zend_class_entry, are described later.
PHP's weakly typed variable container works by being all-inclusive: it has a type tag and storage for every type of variable. Strongly typed languages are usually more efficient than weakly typed ones, because much of the information can be determined before the program runs, which also helps in catching program errors. The price is that writing code is comparatively more restrictive.

PHP is used primarily as a Web development language, and the bottleneck in typical Web applications is usually in the business and data access layers. But in large applications the language itself also becomes a key factor. Facebook therefore uses its own PHP implementation, HipHop, which compiles PHP to C++ code to improve performance. HipHop is not a complete implementation of PHP, however: because it compiles PHP directly to C++, some dynamic features of PHP, such as the eval construct, cannot be supported. There are ways to implement them, so leaving them out was presumably a deliberate tradeoff.

