Analysis of variable parameters and default parameters in C and C++

  • 2020-04-02 01:33:01
  • OfStack

Be aware that C does not support default parameters

C and C++ support functions that take a variable number of arguments, a feature closely tied to the order in which arguments are pushed onto the stack when a function is called. To begin, here is a passage quoted from another author that describes function calls and argument pushing:

-- the quote begins --
C supports functions with a variable number of arguments. The most common example is the familiar printf() family of functions. We also know that arguments are pushed from right to left when a function is called. If the general form of a variable-argument function is:
      f(p1, p2, p3, ...)
then the order in which the arguments are pushed (and popped) is:
      ...
      Push p3
      Push p2
      Push p1
      Call f
      Pop p1
      Pop p2
      Pop p3
      ...
I can conclude that if functions with variable arguments are supported, the arguments must almost certainly be pushed from right to left. Also, the arguments should be popped off the stack not by the function itself but by the caller.

The second half of this conclusion is not hard to understand: the function itself does not know how many arguments the caller passed in, but the caller does, so the caller is responsible for popping all the arguments off the stack.

In the general form of a variable-argument function, the parameters on the left are fixed and the ellipsis on the right stands for the unknown part. Each fixed parameter must sit at a known position on the stack; otherwise the function could not locate it, and there would be no guarantee that the function executes correctly. A parameter's position on the stack is measured as its distance from the function call point (call f). The position of a fixed parameter must not depend on how many arguments were passed, because that number is unknown!

Therefore the only option is for the fixed parameters to sit at a known (short) distance from the function call point. Only pushing arguments from right to left satisfies this condition: the leftmost arguments are pushed last, so they end up closest to the call point, at a fixed distance from it.

This way, when the function starts executing, it can find all of its fixed parameters. The function's own logic is then responsible for finding and interpreting the variable arguments that follow (farther from the call point), usually based on the values of the fixed parameters. The typical case is format-string interpretation in a function such as printf(), which is unfortunately fragile.

It is said that in Pascal the parameters are pushed from left to right, the opposite of C. A language like Pascal, which only supports functions with a fixed number of parameters, does not face the variable-argument problem, so it can choose either push order.
Moreover, its arguments are popped off the stack by the function itself, not by the caller, because the type and number of the function's arguments are completely known. This approach is more efficient than C's because it takes less code (in C, the stack cleanup code is generated at every call site).

C++ still supports functions with variable arguments in order to be compatible with C. But a better choice in C++ is often function overloading.
-- end of quote --

We can check the description above against the declarations of functions such as printf() and sprintf():
_CRTIMP int __cdecl printf(const char *, ...);
_CRTIMP int __cdecl sprintf(char *, const char *, ...);

Both declarations use the __cdecl keyword.
Under the __cdecl calling convention, arguments are passed on the stack and pushed from right to left, and the caller is responsible for cleaning up the stack.

Next, let's take a look at how a function like printf() uses variable arguments. Here's an excerpt from the MSDN example.
Only the ANSI-compatible part of the code is quoted here; for the UNIX version, please refer to MSDN directly.

-- example code --


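#include <stdio.h>
#include <stdarg.h>

int average( int first, ... );

int main( void )
{
   /* The trailing -1 marks the end of the argument list. */
   printf( "Average is: %d\n", average( 2, 3, 4, -1 ) );
   return 0;
}

/* Returns the average of the int arguments that precede the -1 sentinel. */
int average( int first, ... )
{
   int count = 0, sum = 0, i = first;
   va_list marker;

   va_start( marker, first );     /* Initialize variable arguments. */
   while( i != -1 )
   {
      sum += i;
      count++;
      i = va_arg( marker, int);   /* Fetch the next argument. */
   }
   va_end( marker );              /* Reset variable arguments. */
   return( sum ? (sum / count) : 0 );
}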

The example computes an average. The function accepts any number of integer arguments and requires the last one to be -1, which marks the end of the argument list; it then returns the computed average.

The logic is simple. First define
    va_list marker;
to represent the argument list, then call va_start() to initialize it. Note that the va_start() call takes not only the argument list variable marker but also the parameter first, which shows that initializing the argument list depends on the fixed parameter declared by the function (in general, the last fixed parameter). This is crucial, and the analysis below shows why.

After va_start() has initialized the list, you can call va_arg() to access each argument in turn. Note that the second parameter of va_arg() specifies the type of the value it returns (here, int).

When the program determines that all arguments have been read, it calls va_end() to end access to the argument list.

So accessing a variable number of arguments looks easy: it takes one type, va_list, and three functions, va_start(), va_arg(), and va_end(). However, the mechanism that lets a function accept a varying number of arguments is still unclear. We need to dig further to find the exact answer.

The definitions of va_list, va_start(), va_arg(), and va_end() can be found in .../vc98/include/stdarg.h.
The code in stdarg.h is as follows (only the ANSI-compatible part is excerpted; implementations for UNIX and other systems differ slightly, and interested readers can study them):


typedef char *  va_list;
#define _INTSIZEOF(n)   ( (sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1) )
#define va_start(ap,v)  ( ap = (va_list)&v + _INTSIZEOF(v) )
#define va_arg(ap,t)    ( *(t *)((ap += _INTSIZEOF(t)) - _INTSIZEOF(t)) )
#define va_end(ap)      ( ap = (va_list)0 )

As you can see from the code, va_list is just a type alias: it is defined as a char * pointer so that memory can be accessed byte by byte.
The other three "functions" are really just macros. But before looking at them, let's look at the helper macro _INTSIZEOF defined in the middle:

#define _INTSIZEOF(n)   ( (sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1) )

This macro computes the size of a given variable or type n after rounding it up to a multiple of the size of an int. An int takes 4 bytes on a 32-bit system and 2 bytes on a 16-bit system.
The expression
  sizeof(n) + sizeof(int) - 1
adds just enough that any sizeof(n) that is not already a multiple of sizeof(int) reaches or passes the next multiple. In particular, if sizeof(n) is less than sizeof(int), the sum is at least sizeof(int), so it carries into a higher binary digit than sizeof(n) itself occupies.

For example, on a 32-bit system sizeof(short) + sizeof(int) - 1 = 2 + 4 - 1 = 5.
In binary, 5 is 00000101 and sizeof(short), which is 2, is 00000010, so 5 has a set bit one position higher than the highest set bit of 2.

The expression
  ~(sizeof(int) - 1)
generates a mask that discards the "remainder" part of the computed value.
In the example above, sizeof(int) - 1 is 3, or binary 00000011, so ~(sizeof(int) - 1) is binary 11111100 in its lowest byte, or 0xFFFFFFFC as a 32-bit value (thanks to glietboys for the correction).
ANDing this mask with 5 (binary 00000101) gives binary 00000100, which is 4, whereas sizeof(short) computed directly is 2.
In this way, an expression such as _INTSIZEOF(short) gives the size of any type rounded up to a multiple of the size of an int.
The size of an int is used for alignment because, on the 32-bit platform this header targets, each argument occupies a whole number of int-sized slots on the stack (a pointer there is also the same size as an int), and the next three macros move the pointer by these aligned offsets.

For information on byte alignment in programming, please refer to other articles on the web.

Now for the three remaining macro definitions.

The first:
#define va_start(ap,v)  ( ap = (va_list)&v + _INTSIZEOF(v) )

In a program it is used as
    va_list marker;
    va_start( marker, first );
The va_start macro points the argument list pointer (marker) just past the fixed parameter (first): it takes the address of first and adds the aligned size of first, which the _INTSIZEOF macro above computes. The result is the address of the first variable argument.

The second:
#define va_arg(ap,t)    ( *(t *)((ap += _INTSIZEOF(t)) - _INTSIZEOF(t)) )

At first glance this is a little confusing: the expression (ap += _INTSIZEOF(t)) - _INTSIZEOF(t) adds an offset and then subtracts it again, so the value it yields is just the original value of ap. Why?
Producing the return value is only half of its job. Remember that the calls to va_start(), va_arg(), and va_end() work together, and that ap is the argument list pointer set up by va_start(). So the expression

(ap += _INTSIZEOF(t)) - _INTSIZEOF(t)

not only yields the address of the current argument but also advances ap to the next one (note that ap moves forward by _INTSIZEOF(t), the aligned size of type t).

The third:
#define va_end(ap)      ( ap = (va_list)0 )

This one is simple: it sets the ap pointer to null to mark the end of argument reading.

At this point, the mechanism behind variable function arguments in C/C++ is clear. One final note:
While va_arg() steps the pointer from one argument to the next, there is no way to tell whether the next pointer is a valid address, and nothing records how many arguments there are to read. In the averaging example above, the caller must supply a special value (-1) to mark the end of the argument list; if the caller does not follow this rule, the pointer access runs out of bounds.

Some of you might ask: the printf() function doesn't require such a special value, so how does it know where to stop?

Don't worry, printf() uses a different, subtler way of determining the number of arguments. Its first, fixed parameter is the format string used for format control, which contains conversion specifiers such as "%d" and "%s". As printf() parses the format string, the number of specifiers tells it how many of the following arguments to read. Here's an experiment:

printf("%d,%d,%d,%d\n", 1, 2, 3, 4, 5);

More arguments are supplied than the format string has conversion specifiers, and the output is

1,2,3,4

So printf() goes by the format string, which says there are only four arguments, and ignores the rest. Let's do another experiment:

printf("%d,%d,%d,%d\n", 1, 2, 3);

Fewer arguments are supplied than the format string has conversion specifiers, and the output (if no exception occurs) is

1,2,3,2367460

Here everyone's output may differ, because the last read goes past the arguments that were actually supplied and picks up whatever happens to be there. This is where using functions like printf() requires special attention.

Conclusion:
Functions with a variable number of arguments need extra care in use. My personal advice is to avoid this pattern as much as possible. To compute an average, for example, it is better to pass the values to the function in an array or some other list than to write such a contorted function. For one thing, pointer access easily runs out of bounds; for another, listing every value as a separate argument at the call site makes for messy code.

Having said that, there are places where this is useful, such as formatted string composition, as in printf(). In practice, I often use a self-written WriteLog() function to write log files. Its declaration follows the same pattern as printf(), and it is flexible and convenient to use, for example:

WriteLog("User %s, login times %d", "guanzhong", 10);

This writes the following line to the log file:

User guanzhong, login times 10

How to use a programming language, provided the basic rules are followed, is a matter of opinion. In short, once you understand the mechanism thoroughly, choose the good habits that suit you.

