The scope of index variables in Python's for loops

  • 2020-05-10 18:22:09
  • OfStack

Let's start with a quiz. What does this function do?
 


def foo(lst):
  a = 0
  for i in lst:
    a += i
  b = 1
  for t in lst:
    b *= i
  return a, b

If you answered that it "computes the sum and product of all the elements in lst", don't feel bad. The bug here is often hard to spot. If you did spot it in a sea of real code, you're doing well. It's much harder to notice when you aren't expecting a quiz.

The bug is using i instead of t in the body of the second loop. But wait, how does this even work? Shouldn't i be invisible outside the first loop? [1] Well, no. In fact, Python officially declares that the names bound to the target of a for loop (informally called the "index variable" here) leak into the enclosing function scope. So the following code:
 


for i in [1, 2, 3]:
  pass
print(i)

This code is valid and prints 3. In this article I want to explore why this is so, why it's unlikely to ever change, and also use it as a tracer bullet to dig into some interesting parts of the CPython compiler.

By the way, if you don't believe that this behavior can cause real problems, consider this code snippet:
 


def foo():
  lst = []
  for i in range(4):
    lst.append(lambda: i)
  print([f() for f in lst])

If you expected this code to print [0, 1, 2, 3], you'll be disappointed: it prints [3, 3, 3, 3]. Since there is only one i in the scope of foo, it is this single i that all the lambdas capture.
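As an aside, a common workaround (my own illustration, not part of the original quiz) is to bind the current value of i as a default argument, which is evaluated when each lambda is created rather than when it is called:

def foo_fixed():
  lst = []
  for i in range(4):
    lst.append(lambda i=i: i)   # the default argument freezes the current value of i
  print([f() for f in lst])     # prints [0, 1, 2, 3]

foo_fixed()
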
The official documentation

This behavior is explicitly documented in the for loop section of the Python reference documentation:

The for loop makes assignments to the variable(s) in the target list. [...] Names in the target list are not deleted when the loop is finished, but if the sequence is empty, they will not have been assigned to at all by the loop.

Note the last sentence. Let's try it:
 


for i in []:
  pass
print(i)

Indeed, the code above raises a NameError. Later we will see that this is an inevitable consequence of the way the Python VM executes bytecode.
Why is that

I actually asked Guido van Rossum about the reasons for this behavior, and he was kind enough to share some of the historical background (thanks Guido!). The motivation is to keep Python's treatment of variables and scopes simple, without resorting to hacks (such as deleting all the variables defined in a loop once the loop finishes; think of the exceptions that could throw) or to more complex scoping rules.

Python's scoping rules are simple and elegant: the bodies of modules, classes, and functions introduce scopes. Within a function body, variables are visible from the point of their definition to the end of the body, including inside nested blocks such as nested functions. The rules differ slightly for local variables, global variables, and other nonlocal variables, of course, but that has little bearing on our discussion.

The most important point here is that the innermost possible scope is a function body. Not a for loop body. Not a with block. Unlike other programming languages (such as C and its descendants), Python has no nested lexical scopes below the level of a function.

So, judged purely by Python's own scoping rules, this behavior is consistent. Here's another instructive code snippet:
 


for i in range(4):
  d = i * 2
print(d)

Are you surprised that the variable d is visible and accessible after the for loop is done? No, that is simply how Python works. So why should the scope of the index variable be treated any differently?

By the way, the index variables of list comprehensions also leak into their enclosing scope; or, more precisely, they did before Python 3.
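Here is a quick illustration of the difference (my own snippet; the comments describe CPython behavior):

[x for x in range(3)]
try:
  print(x)   # Python 2 would have printed 2 here; in Python 3 the name does not exist
except NameError:
  print("x does not leak out of the comprehension in Python 3")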

Python 3 contained plenty of breaking changes, and a fix for the variable leakage in list comprehensions went in along with them. Such a change undeniably breaks backwards compatibility, which is why I don't expect the for loop behavior to ever change.

In addition, many people still find this to be a useful feature in Python. Consider the following code:
 


for i, item in enumerate(somegenerator()):
  dostuffwith(i, item)
print('The loop executed {0} times!'.format(i+1))

If you don't know in advance how many items somegenerator returns, this is a succinct way to find out after the loop; otherwise you would have to maintain a separate counter.

Here's another example:
 


for i in somegenerator():
  if isinteresting(i):
    break
dostuffwith(i)

This is a perfectly valid pattern: search for an item in a loop, then use it afterwards. [2]

Over the years, many users have asked to keep this behavior. It is hard to introduce a breaking change even for features the core developers consider warts, and even more so when many people find the feature useful and rely on it heavily in real-world code.
Under the hood

Now for the most interesting part. Let's see how the Python compiler and VM cooperate to make this behavior possible. In this particular case, I think the clearest way to present it is to work backwards from the bytecode. I'd also like to use this example to show how to dig into Python's internals [3] (it's great fun!).

Let's look at a simplified version of the function presented at the beginning of this article, keeping just its first loop:
 


def foo(lst):
  a = 0
  for i in lst:
    a += i
  return a
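The listing below can be reproduced with the standard dis module (exact offsets and opcodes vary slightly between CPython versions); for example:

import dis

def foo(lst):
  a = 0
  for i in lst:
    a += i
  return a

dis.dis(foo)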

The bytecode generated is:
 


 0 LOAD_CONST        1 (0)
 3 STORE_FAST        1 (a)
 
 6 SETUP_LOOP       24 (to 33)
 9 LOAD_FAST        0 (lst)
12 GET_ITER
13 FOR_ITER        16 (to 32)
16 STORE_FAST        2 (i)
 
19 LOAD_FAST        1 (a)
22 LOAD_FAST        2 (i)
25 INPLACE_ADD
26 STORE_FAST        1 (a)
29 JUMP_ABSOLUTE      13
32 POP_BLOCK
 
33 LOAD_FAST        1 (a)
36 RETURN_VALUE

As a reminder, LOAD_FAST and STORE_FAST are the opcodes Python uses to access variables known to be local to a function. Since the Python compiler knows at compile time exactly how many local variables each function has, they can be accessed through static array offsets rather than a hash table, which makes access faster (hence the _FAST suffix). But I digress. What really matters here is that the variables a and i are treated identically: both are read with LOAD_FAST and written with STORE_FAST. There is absolutely no reason to believe that their visibility differs in any way. [4]
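Continuing with the same foo as above, a quick check of the code object confirms this (my own addition): both a and i appear as ordinary local variables in the fast-locals name tuple:

print(foo.__code__.co_varnames)   # ('lst', 'a', 'i')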

So how does this come about? Why does the compiler consider i to be simply a local variable of foo? The relevant logic is in the symbol table code, which the compiler runs over the AST before creating the control flow graph from which bytecode is eventually generated. The details of that process are covered in my article about symbol tables, so here I'll only mention the highlights.

The symbol table code doesn't treat the for statement specially in any way. In symtable_visit_stmt we find the following code:
 


case For_kind:
  VISIT(st, expr, s->v.For.target);
  VISIT(st, expr, s->v.For.iter);
  VISIT_SEQ(st, stmt, s->v.For.body);
  if (s->v.For.orelse)
    VISIT_SEQ(st, stmt, s->v.For.orelse);
  break;

The index variable is visited like any other expression. Since this code walks the AST, it's worth looking at what the AST node of a for statement contains. For our for i in [1, 2, 3]: pass loop it looks roughly like this (as produced by ast.dump, reformatted for readability):
 


For(target=Name(id='i', ctx=Store()),
    iter=List(elts=[Num(n=1), Num(n=2), Num(n=3)], ctx=Load()),
    body=[Pass()],
    orelse=[])

So i lives in a node of type Name. Name nodes are handled by the symbol table code in the following clause of symtable_visit_expr:
 


case Name_kind:
  if (!symtable_add_def(st, e->v.Name.id,
             e->v.Name.ctx == Load ? USE : DEF_LOCAL))
    VISIT_QUIT(st, 0);
  /* ... */
  break;

Since we know that i is marked DEF_LOCAL (this is evident from the *_FAST opcodes that access it, and it is also easy to observe with the symtable module if the symbol table code isn't handy), the code above has obviously been reached with symtable_add_def being called with DEF_LOCAL as its third argument. Now look back at the AST above and note the ctx=Store in the Name node. So it is the AST that already carries the information that i is stored into in the target part of the For node. Let's see how that happens.
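For completeness, here is a small sketch (mine, not from the original article) of how the standard symtable module shows that i is just a local of foo:

import symtable

src = '''
def foo(lst):
  a = 0
  for i in lst:
    a += i
  return a
'''
mod = symtable.symtable(src, "<string>", "exec")
foo_table = mod.get_children()[0]          # the symbol table of foo
print(foo_table.lookup("i").is_local())    # True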

The AST construction in the compiler goes over the parse tree (a fairly low-level representation of the source code; some background is available here) and, among other things, sets the expr_context attribute on certain nodes, most notably Name nodes. Think of it this way: in the following statement,
 


foo = bar

both foo and bar end up in Name nodes. But while bar is only loaded by this code, foo is actually stored into. The expr_context attribute is what the symbol table code later uses to distinguish between these kinds of use [5].
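A small demonstration of this with the ast module (my own; the exact dump format differs a little between Python versions):

import ast
print(ast.dump(ast.parse("foo = bar").body[0]))
# e.g. Assign(targets=[Name(id='foo', ctx=Store())], value=Name(id='bar', ctx=Load()))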

Back to the index variable of our for loop. It is handled in ast_for_for_stmt, the function that creates the AST for for statements. Here are the relevant parts of that function:
 


static stmt_ty
ast_for_for_stmt(struct compiling *c, const node *n)
{
  asdl_seq *_target, *seq = NULL, *suite_seq;
  expr_ty expression;
  expr_ty target, first;
  const node *node_target;
 
  /* ... */
 
  node_target = CHILD(n, 1);
  _target = ast_for_exprlist(c, node_target, Store);
  if (!_target)
    return NULL;
  /* Check the # of children rather than the length of _target, since
    for x, in ... has 1 element in _target, but still requires a Tuple. */
  first = (expr_ty)asdl_seq_GET(_target, 0);
  if (NCH(node_target) == 1)
    target = first;
  else
    target = Tuple(_target, Store, first->lineno, first->col_offset, c->c_arena);
 
  /* ... */
 
  return For(target, expression, suite_seq, seq, LINENO(n), n->n_col_offset,
        c->c_arena);
}

The Store context is passed to ast_for_exprlist, which creates the AST node(s) for the index variable (note that the target of a for loop may also be a tuple of variables, not just a single variable).
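To illustrate the tuple case (my own snippet; again, the dump format varies slightly by version):

import ast
tree = ast.parse("for i, item in pairs: pass")
print(ast.dump(tree.body[0].target))
# e.g. Tuple(elts=[Name(id='i', ctx=Store()), Name(id='item', ctx=Store())], ctx=Store())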

This is the final and most important piece of the explanation of why the for loop's index variable lives in the same scope as every other variable in the function: once it has been marked in the AST, the symbol table code and the virtual machine handle it exactly like any other variable.
Conclusion

This article discussed a specific behavior of Python that some consider a "gotcha". I hope it explained how Python's variables and scopes behave here, why this behavior is useful and very unlikely to ever change, and how the internals of the Python compiler make it work. Thanks for reading!

[1] Here I'd love to crack a joke about Microsoft Visual C++ 6, but it's somewhat telling that most readers of this blog won't get it in 2015 (a reflection of my age, not of my readers' abilities).

[2] You could argue that the call dostuffwith(i) can go inside the if, right before the break. But that's not always convenient. Besides, as Guido explained, there is a nice separation of concerns here: the loop is used for searching and only for searching. What happens with the loop variable once the search is done is no longer the loop's concern. I think that's a very good point.

[3] As usual, the code in my articles refers to Python 3. Specifically, I'm looking at the default branch of the CPython repository, heading toward the next (3.5) release. But for this particular topic, the source of any version in the 3.x series should do.

[4] Another thing that's apparent from the disassembly is why i remains invisible if the loop doesn't execute. The GET_ITER and FOR_ITER pair of opcodes treat whatever we loop over as an iterator and call its __next__ method. If that call ends up raising StopIteration, the VM catches it and exits the loop. Only if an actual value is returned does the VM go on to execute STORE_FAST into i, thereby bringing it into existence for subsequent code to refer to.

[5] This is a somewhat curious design, and I suspect the gist of it is to allow relatively clean recursive visitation code over the AST, such as the symbol table code and the CFG generator.

