Principle and Usage Summary of the GIL in Python

  • 2021-09-16 07:35:48
  • OfStack

1. Description

The GIL (Global Interpreter Lock) ensures that only one thread at a time can execute Python bytecode in a given interpreter.

In CPU-bound multithreaded code, the GIL is a performance bottleneck.

The GIL makes Python multithreading pseudo-parallel: threads run concurrently, but not in parallel.

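The GIL is an implementation detail of the CPython interpreter (PyPy has one as well), not a requirement of the Python language; Jython and IronPython, for example, have no GIL.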
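The bottleneck described above can be observed with a small timing experiment. The following is a minimal sketch, assuming a standard (GIL-enabled) CPython 3 build; the function names and the loop size are illustrative and not from the original article. It runs the same pure-Python loop twice in a row and then in two threads at once.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; it never blocks on I/O."""
    while n > 0:
        n -= 1


def time_sequential(n, repeats=2):
    """Run count_down `repeats` times in a row and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, workers=2):
    """Run count_down in `workers` threads at once and return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # hypothetical workload size; adjust for your machine
    print(f"sequential: {time_sequential(n):.3f}s")
    print(f"threaded:   {time_threaded(n):.3f}s")
```

On a GIL build the two timings are usually close to each other, because the two threads cannot execute bytecode at the same moment. The exact numbers depend on the machine and the Python version.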

2. Principle

(1) Threads 1, 2 and 3 take turns to execute. While a thread is running it holds the GIL, which prevents the other threads from executing;

Likewise, each thread releases the GIL after running for a while, which lets the other threads start using resources.

(2) Under the old GIL mechanism, if Thread 2 is scheduled on CPU 2, it still has to wait for Thread 1, running on CPU 1, to release the GIL (remember: the GIL is global).

(3) If Thread 1 gives up the GIL because it blocks on I/O, Thread 2 is certain to get the GIL. But if Thread 1 gives it up because the tick counter has reached 100 ticks (roughly 100 bytecode instructions), Thread 1 and Thread 2 compete for the GIL on equal terms.

(4) The trouble is that in Python 2.x, Thread 1 does not dynamically lower its priority, so it is very likely to be chosen to run again. Over many such rounds, Thread 2 can only watch while Thread 1 keeps running on CPU 1 with the GIL.

(5) In an extreme case, for example when Thread 1 runs a while True loop on CPU 1, the result really is "one core in trouble, eight cores watching".
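The turn-taking in (1) can be observed directly. The sketch below assumes Python 3.2 or later, where forced switches are time-based rather than tick-based, so the starvation described in (4) and (5) for Python 2.x is much less pronounced. The helper name run_two_busy_threads is hypothetical. It starts two busy loops with no I/O and reports how many iterations each one managed.

```python
import threading
import time


def run_two_busy_threads(duration=0.3):
    """Start two pure-Python busy loops; return the iteration count of each."""
    counts = [0, 0]
    stop = threading.Event()

    def spin(index):
        # Tight loop with no I/O: the thread gives up the GIL only when
        # the interpreter forces a switch.
        while not stop.is_set():
            counts[index] += 1

    threads = [threading.Thread(target=spin, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    time.sleep(duration)  # let both loops compete for the GIL
    stop.set()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print(run_two_busy_threads())
```

Both counters end up above zero, which shows that neither loop monopolizes the interpreter. How evenly the work is split varies from run to run.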

Additional background:

Design Rationale and Limitations of the GIL

Python code is executed by the Python virtual machine (the interpreter main loop, in CPython). Python was originally designed so that only one thread runs in the interpreter main loop at any time. Access to the virtual machine is controlled by the Global Interpreter Lock (GIL), which guarantees that only one thread runs at a time.

When external code is called (such as a C or C++ extension function), the GIL stays held until the function returns unless the extension releases it explicitly. No thread switch happens in the meantime, because no Python bytecode is being executed.

Python threads are real operating-system threads: POSIX threads (pthreads) on Linux and native threads on Windows.

From the overview above, Python can run only one thread at a time. A thread must acquire the global interpreter lock before it can run, and there is only one such lock, so a multithreaded program gets only single-core performance even on a multi-core machine.

This makes Python look weak: the GIL directly prevents CPython from using multiple physical cores to speed up execution. So why was it designed this way? Multi-core CPUs were all but unimaginable in the early 1990s, when Guido van Rossum created Python. Hardware has developed rapidly since then, and programs are now expected to use all the CPU performance available or risk falling behind. The same applies to Python.

The above covers mainly the disadvantages of this design. Its advantages are discussed next.

The GIL simplifies the implementation of CPython: the object model, including key built-in types such as dictionaries, is implicitly safe against concurrent access. Locking the whole interpreter makes multithreading easier to support, but it also costs much of the parallel computing ability of multiprocessor hosts.
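The implicit safety of built-in types can be illustrated with a shared list. This is a minimal sketch, assuming CPython; the function name append_from_threads is hypothetical. Several threads append to one list with no explicit lock, and no element is lost, because each list.append call completes as a single step.

```python
import threading


def append_from_threads(workers=4, per_thread=10_000):
    """Append to one shared list from several threads; return its final length."""
    shared = []

    def worker():
        for i in range(per_thread):
            shared.append(i)  # one list.append call, no explicit lock

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(shared)


if __name__ == "__main__":
    print(append_from_threads())  # workers * per_thread
```

This protection covers only single operations on built-in objects. A compound update such as counter += 1 is a read followed by a write, so it still needs a threading.Lock.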

However, both standard-library and third-party extension modules can release the GIL during compute-intensive work. In addition, the GIL is always released around I/O operations: built-in functions that make blocking operating-system calls in C release the GIL before the call, so other threads can run while waiting for the I/O. For a purely computational program with no I/O, the interpreter forces the running thread to give up the GIL periodically: every 100 ticks in Python 2, or after a fixed time interval (5 ms by default) in Python 3.2 and later.
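In Python 3, the fixed time interval mentioned above can be read and changed through sys.getswitchinterval() and sys.setswitchinterval(). The helper with_switch_interval below is a hypothetical convenience wrapper, not part of the standard library. It is a sketch of how the value might be adjusted temporarily and then restored.

```python
import sys


def with_switch_interval(seconds, func):
    """Call func() with a temporary switch interval, then restore the old one."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("current switch interval:", sys.getswitchinterval())
    # Read the interval back while the temporary value is in effect.
    print("inside helper:", with_switch_interval(0.001, sys.getswitchinterval))
```

A shorter interval makes threads switch more often, at the cost of more switching overhead. Whether that helps depends on the workload.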

In other words, I/O-bound Python programs benefit far more from multithreading than compute-bound programs do.
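The I/O-bound case can be sketched with time.sleep standing in for a blocking I/O call, since it releases the GIL while waiting. The names fake_io and timed_io_threads are illustrative only. Four threads that each wait 0.2 seconds finish in roughly the time of one wait, not four.

```python
import threading
import time


def fake_io(delay):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)


def timed_io_threads(workers=4, delay=0.2):
    """Run `workers` simulated I/O waits in parallel threads; return elapsed seconds."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"4 waits of 0.2s took {timed_io_threads():.2f}s in threads")
```

Real network or disk calls behave in a similar way, although their timing is less predictable than a fixed sleep.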

