An Analysis of the Relationship between Python and the GIL (Global Interpreter Lock)

  • 2021-11-30 00:42:45
  • OfStack

As is well known, CPython has a global interpreter lock, or GIL for short. It is a mutex that protects the interpreter's objects by preventing multiple threads from executing Python bytecode at the same time. This guarantees thread safety inside the interpreter, but it also means that Python threads cannot take advantage of multi-core CPUs for CPU-bound work. For this reason, some people say that Python's multithreading is pseudo-multithreading and that its performance is poor.
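The effect can be observed with a small timing experiment. The sketch below is only an illustration: the function names and the workload size N are hypothetical, and the timings depend on the machine and on the interpreter build. On a standard CPython build with the GIL, running a pure-Python loop in two threads tends to take about as long as running it twice in a row.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; returns the number of iterations done."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def run_in_threads(n, num_threads):
    """Run count_down(n) in num_threads threads; return (results, seconds)."""
    results = [None] * num_threads

    def worker(index):
        results[index] = count_down(n)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000  # hypothetical workload size, chosen only for the demo

    # Two runs back to back in the main thread.
    start = time.perf_counter()
    count_down(N)
    count_down(N)
    sequential = time.perf_counter() - start

    # The same two runs, each in its own thread.
    _, threaded = run_in_threads(N, 2)

    print(f"sequential: {sequential:.2f}s, two threads: {threaded:.2f}s")
```

No particular numbers are claimed here; the point is only to compare the two timings on your own machine.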

To judge whether the "pseudo-multithreading" criticism is fair, first look at where the GIL came from.

The Origin of the GIL

Python was first released in 1991. At that time, CPUs were all single-core, and on a single core multithreading was designed mainly to let a program do I/O on one side while doing CPU computation on the other. The Python interpreter is written in C, which is why it is also called CPython. Many programming languages of that era had no automatic memory management. To implement automatic garbage collection, Python keeps a reference count for every object. When the count drops to 0, the object can be reclaimed and its memory released. For example:


>>> import sys
>>> data = { 'gzh': 'Python7 No. '}
>>> var1 = data
>>> sys.getrefcount(data)
3
>>> 

Here the data object has three references: one is the name data itself, one is the variable var1, and the third is the argument passed to the getrefcount function. If another thread takes a reference to data at this point, the reference count increases by 1. If a thread finishes using data and ends, the reference count decreases by 1. If multiple threads modify the same reference count at the same time, they hit a race condition, and avoiding race conditions requires a lock.

If every object had its own lock, another problem could arise, namely deadlock, and the frequent acquiring and releasing of locks would degrade performance. The simplest and most effective approach is a single interpreter lock: a thread must acquire it before executing any bytecode, which avoids deadlock and costs little performance. Because CPUs were single-core at the time, the GIL was simple to design and did not hurt performance, so it has been in use ever since. In short, the GIL came into being, and still exists, mainly because Python's memory management is not thread-safe.
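The idea of guarding a shared count with a single lock can be sketched in pure Python. This is an analogy only, not how CPython implements the GIL (the real lock lives in the interpreter's C code); SharedCounter and hammer are hypothetical names made up for this illustration.

```python
import threading


class SharedCounter:
    """A counter whose read-modify-write step is guarded by one lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value,
        # and one of the two updates would be lost.
        with self._lock:
            self.value += 1


def hammer(counter, num_threads, increments_per_thread):
    """Increment the counter from several threads and return the final value."""

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    total = hammer(SharedCounter(), num_threads=4, increments_per_thread=10_000)
    print(total)
```

Because every increment happens while the lock is held, no update is lost, so the final value equals the number of threads times the increments per thread.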

Attempts to remove the GIL

Once CPUs entered the multi-core era and could run several computing tasks at the same time, the GIL became a real problem. In 1999, a developer named Greg Stein removed the GIL from a fork of Python 1.5, replaced it with finer-grained locks on mutable data structures, and also submitted patches that removed much of the dependence on global mutable objects. Benchmarks then showed that single-threaded execution without the GIL was nearly twice as slow as with it, even on a Windows machine, which was the best-performing platform tested at the time. In other words, without the GIL, two CPUs delivered only slightly better performance than one CPU with the GIL. That gain was clearly not worth the cost, and Greg Stein's attempt ended in failure.
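The arithmetic behind that conclusion can be written out in a few lines. This is a back-of-the-envelope sketch that assumes work scales perfectly across cores, which real programs rarely achieve; the function name and the 1.8 slowdown figure are hypothetical stand-ins for "nearly twice as slow".

```python
def relative_throughput(cores, single_thread_slowdown):
    """Throughput relative to one core running with the GIL.

    Assumes perfect scaling across cores (an idealization).
    """
    return cores / single_thread_slowdown


if __name__ == "__main__":
    # Exactly twice as slow per thread: two cores only break even.
    print(relative_throughput(2, 2.0))
    # Hypothetical "nearly twice as slow": two cores are slightly ahead.
    print(relative_throughput(2, 1.8))
```

Under these assumptions, a second CPU buys little more than what was lost on each individual thread.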

Guido van Rossum, the creator of Python, has said he would welcome community attempts to remove the GIL, as long as single-threaded performance does not suffer, but he has also noted that removing the GIL is not an easy task.

The question of removing the GIL comes up occasionally on the Python developer mailing list, but any proposal must meet the following requirements:

1. Simplicity. The solution must be implementable and maintainable in the long run.
2. Concurrency. Removing the GIL must actually improve multithreaded performance.
3. Speed. Removing the GIL must not degrade single-threaded performance.
4. CPython features. The solution must support CPython features such as __del__ and weak references.
5. API compatibility. The solution should be source-compatible with all the macros used by existing CPython extensions.
6. Prompt destruction. Unreachable objects should be destroyed and their memory reclaimed promptly.
7. Ordered destruction. For example, if an unreachable object X refers to A, X should be destroyed before A (some garbage collection algorithms cannot guarantee this).

Some of these requirements are hard to meet, such as 4, 5 and 7. So far, no one has managed to remove the GIL while meeting all of them.

Too entrenched to reverse

Python has become extremely popular in recent years. Many excellent libraries are written against CPython, and many C extension libraries date back to the 1990s. If the GIL were removed, the many C extensions written on the assumption that the GIL exists would stop working. In other words, the GIL would be gone, but so would a large number of extensions and third-party libraries in the Python ecosystem.

There is an obvious example of this. CPython is not the only Python interpreter: there are also Jython, written in Java, and IronPython, implemented on .NET. These interpreters have no GIL at all, but how many people write extensions for them?

Python's popularity has a lot to do with its rich collection of third-party libraries, and that is why the GIL is so hard to remove.

Why Python 3 did not remove the GIL from the start

Python 3 was an opportunity to introduce many new features from the start. In the process it broke some existing C extensions, which then had to be updated and ported to work with Python 3. This is why Python 3 was not accepted by the community at first.

Removing the GIL would have made Python 3 slower than Python 2 in single-threaded performance, and many excellent extensions would have stopped working. In that case Python 3 would very likely have had no future, so in the end Python 3 kept the GIL.

However, Python 3 did bring a major improvement to the existing GIL. Python 3.2 addressed the problem that, when CPU-bound and I/O-bound threads run side by side, the I/O-bound threads could go a long time without acquiring the GIL and therefore could not run. This improved multithreaded performance.
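One visible trace of the reworked GIL is the thread switch interval, which the sys module has exposed since Python 3.2 through sys.getswitchinterval() and sys.setswitchinterval(). The sketch below only reads the value and changes it temporarily; with_switch_interval is a hypothetical helper written for this illustration, and whether tuning the interval helps a given program is something to measure rather than assume.

```python
import sys


def with_switch_interval(seconds, func, *args):
    """Call func(*args) with a temporary switch interval, then restore it."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func(*args)
    finally:
        # Always put the previous interval back, even if func raises.
        sys.setswitchinterval(old)


if __name__ == "__main__":
    # How long a thread may keep running before the interpreter
    # asks it to give other threads a turn, in seconds.
    print(f"switch interval: {sys.getswitchinterval()} seconds")

    # Run a trivial function under a shorter, hypothetical interval.
    print(with_switch_interval(0.001, sum, [1, 2, 3]))
```

A shorter interval makes threads hand over more often, at the cost of more switching overhead.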

Last words

Because Python's memory management is not thread-safe, Python has had the GIL since its birth, and many extensions were later written under the GIL's protection. After so long, this accumulated weight is hard to reverse. Python 3 also kept the GIL from the start because removing it would have degraded single-threaded performance. Python is now at version 3.9, and it is very unlikely that Python will remove the GIL in the future. In other words, a Python without the GIL would not be the Python we know.

But don't be discouraged. The GIL only affects multithreaded, computationally intensive tasks, and most programmers rarely run into that scenario. Even when they do, they can use multiple processes to sidestep the GIL, or use another programming language. No programming language or technology is perfect, and what matters most is playing to each one's strengths. GIL or not, I will keep using Python.
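As a rough illustration of the multi-process route, the sketch below spreads a CPU-bound function across worker processes with the standard library's concurrent.futures.ProcessPoolExecutor. Each worker is a separate interpreter with its own GIL. The function names and the naive prime-counting workload are hypothetical, chosen only to have something CPU-bound to run.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count the primes below limit with naive trial division (CPU-bound)."""
    count = 0
    for candidate in range(2, limit):
        # candidate is prime if no d in [2, sqrt(candidate)] divides it.
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, max_workers=None):
    """Run count_primes once per limit, each call in a worker process."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves the order of the inputs in its results.
        return list(pool.map(count_primes, limits))
```

A call such as count_primes_parallel([50_000] * 4) belongs under an if __name__ == "__main__": guard in a script, because some platforms start worker processes by importing the main module again. Processes cost more to start and to exchange data with than threads, so this approach tends to pay off only when each task does a substantial amount of computation.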
