A deep dive into the pitfalls of default argument values in Python functions

  • 2020-04-02 14:44:56
  • OfStack

This article describes the potential harm of using a mutable object as the default value of a Python function parameter, and then looks at how the behavior is implemented and why it was designed that way.
Reproducing the trap

Let's use a practical example to illustrate what we're going to talk about today.

The following code defines a function called generate_new_list_with. The function is meant to create a new list containing the given element each time it is called. Here is what actually happens:
 


Python 2.7.9 (default, Dec 19 2014, 06:05:48)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def generate_new_list_with(my_list=[], element=None):
...   my_list.append(element)
...   return my_list
...
>>> list_1 = generate_new_list_with(element=1)
>>> list_1
[1]
>>> list_2 = generate_new_list_with(element=2)
>>> list_2
[1, 2]
>>>

We can see that the code does not behave as expected. On the second call, instead of getting a new list containing only 2, list_2 is the result of the first call with a 2 appended to it. Why does Python behave this way, when in other programming languages it would look almost like a design bug?
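
A quick way to confirm this is to check object identity. The sketch below repeats the session above as a script (same function, same calls) and adds an `is` check; it is only an illustration of the observation, not part of the original session.

```python
def generate_new_list_with(my_list=[], element=None):
    my_list.append(element)
    return my_list

list_1 = generate_new_list_with(element=1)
list_2 = generate_new_list_with(element=2)

# Both names refer to the very same list object, so list_1 has changed too.
print(list_1 is list_2)  # True
print(list_1)            # [1, 2]
```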
Background: the nature of Python variables

To understand the cause of this problem, we first need some background: how exactly are Python variables implemented?

Python variables differ from those of many other programming languages. Where those languages declare a variable and assign a value to it, Python creates an object and points a name at it, much like a pointer. That is, a variable in Python is really a reference to a value or object; the variable itself is nothing more than a name. Let's look at an example.
 


p = 1
p = p+1

In a traditional language, the code above is executed by first reserving memory for the variable p and then storing 1 in that memory. When the addition is performed, the result 2 is stored back at the same memory address where p lives. Throughout execution, what changes is the value stored at the memory address of the variable p.

In Python, the same code creates an object 1 in memory and points the name p at it. The addition produces a new object 2, and p is then pointed at that new object. Throughout execution, what changes is the memory address that p points to.
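
The rebinding described here can be observed directly. In this minimal sketch the name `old` is introduced purely for illustration, to keep a second reference to the original object:

```python
p = 1
old = p          # a second name for the object that p refers to
p = p + 1        # produces the object 2 and rebinds the name p to it

print(old)       # 1: the original object was not modified
print(p)         # 2
print(p is old)  # False: p now refers to a different object
```

Nothing was overwritten in place: the object 1 is still reachable through `old`, and only the binding of `p` changed.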
The root cause of the function parameter default value trap

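In one sentence: the default values of a Python function's parameters are bound once, when the function is defined (that is, when the def statement is executed), not each time the function is called.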

Now let's examine the reason for this trap in detail. Here is an excerpt from The Hitchhiker's Guide to Python (link: http://docs.python-guide.org/en/latest/writing/gotchas/) that explains it:

Python's default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.

In other words, a parameter's default value is determined when the function is defined. In every later call that does not pass the argument explicitly, the parameter is simply a reference to the object that was created at definition time. Across all such calls, the parameter is an alias for that one object.
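
One way to see when the default is evaluated is to give the default expression a visible side effect. The following is a minimal sketch; `make_default`, `collect`, and `evaluations` are hypothetical names chosen for illustration.

```python
evaluations = []  # hypothetical log of when the default expression runs

def make_default():
    # Record every evaluation of the default expression.
    evaluations.append("evaluated")
    return []

def collect(item, bucket=make_default()):
    bucket.append(item)
    return bucket

print(evaluations)  # ['evaluated']: evaluated at the def statement, before any call
collect(1)
collect(2)
print(evaluations)  # ['evaluated']: the two calls did not evaluate it again
```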

If the default value is an immutable object, then modifying the parameter inside the function body merely rebinds the parameter to a new immutable value. If the default is a mutable object, as in the example at the beginning of this article, the situation is worse: every in-place change to the parameter inside the function body is a change to the one object that was created when the function was defined.
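
The contrast between the two cases can be sketched as follows. The functions `bump` and `remember` are hypothetical and exist only for this illustration.

```python
def bump(count=0):
    count = count + 1  # rebinds the local name; the default object 0 is untouched
    return count

def remember(item, seen=[]):
    seen.append(item)  # mutates the shared default list in place
    return seen

print(bump())         # 1
print(bump())         # 1
print(remember("a"))  # ['a']
print(remember("b"))  # ['a', 'b']
```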

The official Python tutorial (link: https://docs.python.org/3/tutorial/controlflow.html#more-on-defining-functions) carries a specific warning about this trap:

Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. For example, the following function accumulates the arguments passed to it on subsequent calls:
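
The example that follows this warning is not reproduced in the excerpt. In the tutorial it is along these lines; treat the exact form as recalled from the linked page rather than quoted from this article.

```python
def f(a, L=[]):
    L.append(a)
    return L

print(f(1))  # [1]
print(f(2))  # [1, 2]
print(f(3))  # [1, 2, 3]
```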
How to keep this trap from causing trouble

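The best approach, of course, is not to use mutable objects as default values. When a function needs a fresh mutable object on each call, one solution is to use None as the default and create the object inside the function body. Taking the requirement from the beginning of the article again: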
 


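def generate_new_list_with(my_list=None, element=None):
  # None is a sentinel meaning "no list was passed in".
  if my_list is None:
    # Build a fresh list on every call instead of sharing one default object.
    my_list = []
  my_list.append(element)
  return my_list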
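
As a usage sketch, the calls from the beginning of the article now produce independent lists. The second half (the `existing` and `result` names are illustrative) shows a related point worth keeping in mind: a list that the caller passes in explicitly is still modified in place.

```python
def generate_new_list_with(my_list=None, element=None):
    if my_list is None:
        my_list = []
    my_list.append(element)
    return my_list

list_1 = generate_new_list_with(element=1)
list_2 = generate_new_list_with(element=2)
print(list_1)              # [1]
print(list_2)              # [2]
print(list_1 is list_2)    # False

# A list passed explicitly is still appended to in place and returned as-is.
existing = [0]
result = generate_new_list_with(existing, element=3)
print(existing)            # [0, 3]
print(result is existing)  # True
```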

Why was Python designed this way?

The answer can be found in this Stack Overflow question (link: http://stackoverflow.com/questions/1132941/least-astonishment-in-python-the-mutable-default-argument). Here are the most important excerpts from the answer that received the most votes:

Actually, this is not a design flaw, and it is not because of internals, or performance.

It comes simply from the fact that functions in Python are first-class objects, and not only a piece of code.

As soon as you get to think of it this way, then it completely makes sense: a function is an object being evaluated on its definition; default parameters are kind of "member data" and therefore their state may change from one call to the other, exactly as in any other object.

In any case, Effbot has a very nice explanation of the reasons for this behavior in Default Parameter Values in Python.

I found it very clear, and I really suggest reading it for a better knowledge of how function objects work.
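
The "member data" description can be checked on the function object itself, since CPython exposes the stored defaults through the `__defaults__` attribute. This is a minimal sketch, with `tally` as a hypothetical function used only for illustration.

```python
def tally(item, seen=[]):
    seen.append(item)
    return seen

print(tally.__defaults__)  # ([],): the default list lives on the function object
tally("a")
print(tally.__defaults__)  # (['a'],): its state changed between calls

# The list returned by a defaulted call is the very object stored on the function.
print(tally("b") is tally.__defaults__[0])  # True
```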

In this answer, the answerer's view is that functions in Python are first-class objects and that a parameter's default value is an attribute of the function object. In other languages, too, an object's attributes are bound when the object is created. It is therefore not surprising that the default values of function parameters are bound when the function is defined.
However, many other respondents are not convinced. They point out that even with first-class function objects, the defaults could still have been bound at call time by using a closure.

This is not a design flaw. It is a design decision. Perhaps a bad one, but not an accident. The state thing is just like any other closure: a closure is not a function, and a function with mutable default argument is not a function.

Some rebuttals even set the implementation aside and argue purely from the design point of view: any behavior that goes against programmers' basic intuition is a design flaw! Here are some of their arguments:

Sorry, but anything considered "The biggest WTF in Python" is most definitely a design flaw. This is a source of bugs for everyone at some point, because no one expects that behavior at first, which means it should not have been designed that way to begin with.

The phrases "this is not generally what was intended" and "a way around this is" smell like they're documenting a design flaw.

So, unless Python's author himself offers some clarification, the real answer to this question may well remain a mystery.

