python Multi thread and Multi process Relation Detailed Explanation

  • 2021-08-21 20:52:46
  • OfStack

A general explanation of multithreading:

In the standard library of Python, two modules are given: _ thread and threading. _ thread is a low-level module that does not support daemon threads. When the main thread exits, all sub-threads will be forced to exit. threading is an advanced module, which is used to encapsulate _ thread and support daemon threads. In most cases, people only need to adopt threading as an advanced module.

A general explanation of multi-process:

Multi-process is the concurrency between remote and local given by multiprocessing module. In the case of an multiprocessing library, all the child processes are run by a parent process, and the parent process becomes an madter process, which will manage a series of object states. If this process exits, the child process is likely to be in an unstable state, so the parent process should do less to maintain its stability as far as possible

So the difference between python multithreading and multiprocess is as follows:

In multithreading, the process number of all sub-threads is 1; In multi-process, the child process number of not 1 sample is not 1 sample

Threads share memory space; The memory of the process is independent

Multi-threads can share global variables, but multi-processes cannot

The threads of the same process can communicate directly; If two processes want to communicate, they must use an intermediate agent to realize it

One thread can control and operate other threads in the same process; But a process can only operate on child processes

Creating a new thread is simple; Creating a new process requires 1 clone of its parent process

Content extension:

python Multi-process, Multi-thread Connection and Difference

Process

Concept: A process is the smallest unit of resources allocated by the operating system

Understanding: An application program includes at least one process, and each process has an independent memory unit during execution. The multi-process programming of python mainly depends on multiprocess module.

Interprocess communication: First of all, processes can communicate with each other. There are 7 common communication methods, such as pipeline pipe, named pipeline FIFO, message queue MessageQueue, shared memory SharedMemory, semaphore Semaphore, socket Socket and signal signal (due to the actual development, only a brief introduction is made here). However, processes are usually independent of each other, and each process has independent memory. Through shared memory (nmap module), objects can be shared between processes, so that multiple processes can access the same variable (same address, variable name may be different). Sharing resources among multiple processes will inevitably lead to competition among processes, so the use of shared state should be prevented as much as possible.

Thread

Concept: Threads are the smallest unit of operating system scheduling

Understanding: A process includes one or more threads, which occupy less resources and are more efficient. Moreover, multiple threads of a process share memory during execution, so there is no doubt that threads can communicate with each other. The multi-process programming of python mainly depends on threading module

Communication between threads: There are mainly two ways, one is locking, and the other is mutual exclusion to ensure that only one thread has the ability to modify global variables at any time. The other is to use message queue, which is the classic producer and consumer model. One is responsible for generation and the other is responsible for consumption. The generated products are stored in queue, which realizes communication between different threads.

The difference between multi-process and multi-thread

Because the creation and destruction of processes involve the allocation and recovery of system resources, the overhead of multi-process is obviously greater than that of multi-thread.

Application scenarios of both

For CPU-intensive code (such as loop computing)-multi-process efficiency is higher

For IO-intensive code (such as file manipulation, web crawler)-multithreading is more efficient

Reason: For IO intensive operations, most of the time consumed is actually waiting time. In the waiting time, CPU does not need to work, so you can't make use of double CPU resources during this period. On the contrary, for CPU intensive codes, two CPU work much faster than one CPU. So why is multithreading useful for IO-intensive code? At this time, because python meets the wait, it will release GIL for new threads to use, thus realizing the switching between threads.


Related articles: