Basic Use of Python's IO Multiplexing selectors Module

  • 2021-12-12 09:08:10
  • OfStack

Contents
1. IO multiplexing
1.1. Comparison of epoll, poll and select
2. Basic use of the selectors module

1. IO multiplexing

IO multiplexing uses an intermediary that monitors multiple blocking IO objects at the same time. When one or more of the monitored IO objects has data ready, the intermediary returns those objects so that their data can be read.

The advantage of IO multiplexing is that a single process can handle multiple blocking IO operations at once, even with only one thread. Compared with the traditional multi-thread/multi-process model, IO multiplexing has less overhead: the system does not need to create new processes or threads or manage their execution, which reduces maintenance work and saves system resources.
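A minimal sketch of this idea, assuming the standard library's selectors module (introduced in the next paragraph) as the intermediary. The helper name wait_for_ready and the use of socket.socketpair() to simulate three connections are illustrative choices, not part of the original article.

```python
import selectors
import socket

def wait_for_ready(pairs, timeout=1.0):
    """Register the read end of every pair and return the names that are ready."""
    sel = selectors.DefaultSelector()
    try:
        for name, (reader, _writer) in pairs.items():
            sel.register(reader, selectors.EVENT_READ, data=name)
        # One call in one thread watches all registered sockets at once.
        return sorted(key.data for key, _mask in sel.select(timeout=timeout))
    finally:
        sel.close()

# Three independent connections, simulated with socket.socketpair().
pairs = {name: socket.socketpair() for name in ("a", "b", "c")}
pairs["a"][1].send(b"hello")   # only "a" and "c" receive data
pairs["c"][1].send(b"world")

ready = wait_for_ready(pairs)
print(ready)

for reader, writer in pairs.values():
    reader.close()
    writer.close()
```

No extra thread or process is created here: a single call reports which of the monitored sockets have data waiting.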

Python provides the selectors module to implement IO multiplexing. The mechanism available to act as the intermediary differs between operating systems. The common ones are epoll, kqueue, devpoll, poll and select. kqueue (supported by BSD and macOS), devpoll (supported by Solaris) and epoll work in basically the same way. epoll is available in Linux kernels from 2.5 onward, while Windows only implements select.
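To see which of these mechanisms a given machine offers, one can check which selector classes the selectors module defines. This is only a sketch: the CANDIDATES list is written out by hand here, and the result depends on the platform the code runs on.

```python
import selectors

# Selector classes the selectors module may define; each exists only where
# the operating system supports the underlying mechanism.
CANDIDATES = [
    "EpollSelector",
    "KqueueSelector",
    "DevpollSelector",
    "PollSelector",
    "SelectSelector",
]

available = [name for name in CANDIDATES if hasattr(selectors, name)]
default_name = selectors.DefaultSelector.__name__

print("available:", available)
print("default:  ", default_name)
```

DefaultSelector is an alias for one of the available classes, so code written against it does not need to name a specific mechanism.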

1.1. Comparison of epoll, poll and select

select and poll detect ready IO by polling: each call traverses every monitored IO object to check whether it has data, which is time-consuming and inefficient. One advantage of poll over select is that select limits the number of monitored IO objects (typically to 1024), which is clearly not enough for servers that handle a large number of network connections; poll has no such limit. The underlying problem remains, however: the more IO objects are monitored, the longer each poll takes and the lower the efficiency. Polling cannot solve this.
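The polling style can be sketched with select.select from the standard library. The five socket pairs below are stand-ins for real connections and are only an assumption for the demonstration.

```python
import select
import socket

# Five simulated connections; socket.socketpair() stands in for real clients.
pairs = [socket.socketpair() for _ in range(5)]
readers = [reader for reader, _writer in pairs]

pairs[3][1].send(b"ping")   # only connection 3 has data

# The complete list of monitored sockets is handed over on every call,
# and every entry is checked, even though only one of them is ready.
readable, _, _ = select.select(readers, [], [], 1.0)
ready_indexes = [readers.index(sock) for sock in readable]
print(ready_indexes)

for reader, writer in pairs:
    reader.close()
    writer.close()
```

The cost of each call grows with the length of the list passed in, regardless of how few sockets are actually active.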

epoll was created to solve this problem. It places no limit on the number of monitored IO objects and does not poll them. Instead, it uses an event notification mechanism with callbacks: only "active" IO objects trigger their callback, so they can be processed directly without scanning the rest.
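For comparison, here is a hedged sketch that uses select.epoll directly. epoll exists only on Linux, so the hypothetical helper epoll_ready_fds returns None elsewhere; the socket pairs are again stand-ins for real connections.

```python
import select
import socket

def epoll_ready_fds(timeout=1.0):
    """Return (ready fds, fd that was written to), or None where epoll is unavailable."""
    if not hasattr(select, "epoll"):      # epoll is Linux-only
        return None
    pairs = [socket.socketpair() for _ in range(5)]
    ep = select.epoll()
    try:
        for reader, _writer in pairs:
            # Each descriptor is registered once, not passed in on every wait.
            ep.register(reader.fileno(), select.EPOLLIN)
        pairs[2][1].send(b"ping")
        events = ep.poll(timeout)         # (fd, event mask) pairs for ready fds only
        ready = [fd for fd, mask in events if mask & select.EPOLLIN]
        return ready, pairs[2][0].fileno()
    finally:
        ep.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()

result = epoll_ready_fds()
print(result)
```

Unlike the select call, the wait here returns only the descriptors that are ready, so the caller never iterates over idle connections.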

2. Basic use of the selectors module


import selectors
import socket

# Create a listening socket; once it is listening it can accept connection requests
sock = socket.socket()
sock.bind(("127.0.0.1", 80))  # Ports below 1024 may require elevated privileges
sock.listen()

slt = selectors.DefaultSelector()  # Use the system default selector: select on Windows, epoll on Linux
# Register the socket with the selector so that it is monitored
slt.register(fileobj=sock, events=selectors.EVENT_READ, data=None)

# Process events in a loop
while True:
    # select() blocks until at least one registered IO object is ready, then returns the ready objects
    ready_events = slt.select(timeout=None)
    print(ready_events)     # The ready IO objects
    break

ready_events is a list containing one tuple for each registered IO object that is ready. Each tuple holds a SelectorKey object and a mask value.

SelectorKey object:

  • fileobj: the registered socket object
  • fd: the file descriptor
  • events: the events that were registered for
  • data: the value passed in at registration. It can be any value and is attached to the key as an attribute, which is convenient for later use.

mask value:

  • EVENT_READ: readable; its value is 1
  • EVENT_WRITE: writable; its value is 2
  • or a combination of the two

For example:

[(SelectorKey(fileobj=<socket.socket fd=456, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 80)>, fd=456, events=1, data=None), 1)]
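The fields of the returned tuple can be inspected directly. The sketch below assumes a socket.socketpair() in place of a real client and an arbitrary string, "my-tag", as the data value.

```python
import selectors
import socket

reader, writer = socket.socketpair()
sel = selectors.DefaultSelector()
sel.register(reader, selectors.EVENT_READ, data="my-tag")
writer.send(b"x")

key, mask = sel.select(timeout=1.0)[0]

same_object = key.fileobj is reader          # the object that was registered
same_fd = key.fd == reader.fileno()          # its file descriptor
is_readable = bool(mask & selectors.EVENT_READ)
is_writable = bool(mask & selectors.EVENT_WRITE)
both = selectors.EVENT_READ | selectors.EVENT_WRITE   # combined mask

print(same_object, same_fd, key.data, is_readable, is_writable, both)

sel.close()
reader.close()
writer.close()
```

Because the mask is a bit field, it should be tested with & rather than compared with ==, so that a combined read/write event is still recognized.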

To handle this request, use the corresponding socket method. This socket receives connection requests, so the request is handled by calling its accept method.

Accepting the request produces a new client socket, which we also register with the selector for monitoring. When an event arrives, a connection request is handled by the handle_request() function and a client message by the handle_client_msg() function.
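One straightforward way to tell the two cases apart is to compare the ready socket with the listening socket. The sketch below is an illustration under assumptions: it binds to port 0 so the OS picks a free port, creates its own stand-in client, and uses a bounded loop so that it terminates.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port (demo assumption)
listener.listen()
sel.register(listener, selectors.EVENT_READ)

# A stand-in client so the sketch runs on its own.
client = socket.create_connection(listener.getsockname())
client.sendall(b"hello")

log = []
for _ in range(10):                       # bounded loop instead of `while True`
    for key, mask in sel.select(timeout=1.0):
        if key.fileobj is listener:       # connection request
            conn, addr = listener.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
            log.append("connection")
        else:                             # message from an accepted client
            log.append(key.fileobj.recv(1024))
    if len(log) == 2:
        break

print(log)

# Clean up everything that is still registered.
for key in list(sel.get_map().values()):
    sel.unregister(key.fileobj)
    key.fileobj.close()
sel.close()
client.close()
```

This works for two kinds of socket, but every additional kind needs another branch inside the loop.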

The selector now holds two kinds of socket, so after one is activated we need to determine which kind it is and call the matching function. If the selector holds many kinds of socket, this check becomes unwieldy. The workaround is to bind the handler to the corresponding SelectorKey object through the data parameter.


import selectors
import socket

def handle_request(sock: socket.socket, mask):    # Handle a new connection
    conn, addr = sock.accept()
    conn.setblocking(False)  # Set non-blocking
    slt.register(conn, selectors.EVENT_READ, data=handle_client_msg)

def handle_client_msg(sock: socket.socket, mask):  # Handle a client message
    data = sock.recv(1024)  # recv() requires a buffer size; 1024 is an arbitrary choice
    if data:
        print(data.decode())
    else:  # Empty bytes: the client closed the connection
        slt.unregister(sock)
        sock.close()

sock = socket.socket()
sock.bind(("127.0.0.1", 80))
sock.listen()

slt = selectors.DefaultSelector()
slt.register(fileobj=sock, events=selectors.EVENT_READ, data=handle_request)

while True:
    ready_events = slt.select(timeout=None)
    for event, mask in ready_events:
        # Each socket carries its own handler in data. Call that handler with the
        # socket itself as an argument, so every kind of socket is handled correctly.
        event.data(event.fileobj, mask)

Using data this way solves the problem above, but note that every function (or callable object) bound to the data attribute is called in the same way, as event.data(event.fileobj, mask), so all of these functions must accept the same parameters.
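The dispatch pattern can be exercised end to end without an external client. The following is a self-contained sketch under assumptions: port 0 (an OS-chosen free port) replaces port 80, a stand-in client is created in the same script, the loop is bounded, and messages are collected in a list named received instead of being printed.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []

def handle_request(sock, mask):
    """Accept a new connection and register it with its own handler."""
    conn, _addr = sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, data=handle_client_msg)

def handle_client_msg(sock, mask):
    """Read a client message, or clean up when the client has disconnected."""
    data = sock.recv(1024)
    if data:
        received.append(data.decode())
    else:                                  # empty bytes: the peer closed
        sel.unregister(sock)
        sock.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))            # port 0: OS-chosen free port (demo assumption)
listener.listen()
sel.register(listener, selectors.EVENT_READ, data=handle_request)

client = socket.create_connection(listener.getsockname())
client.sendall(b"hello")
client.close()

for _ in range(10):                        # bounded loop instead of `while True`
    for key, mask in sel.select(timeout=1.0):
        key.data(key.fileobj, mask)        # same call for every kind of socket
    if received and len(sel.get_map()) == 1:
        break                              # message read and client cleaned up

remaining = len(sel.get_map())             # only the listener should be left
print(received, remaining)

sel.unregister(listener)
listener.close()
sel.close()
```

Both handlers take (sock, mask), which is what allows the loop to call key.data without knowing which kind of socket became ready.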

