In-depth understanding of the select module in Python
- 2020-05-30 20:27:24
- OfStack
Introduction
The select module in Python is dedicated to I/O multiplexing. It provides three methods, select, poll, and epoll (the latter two are available on Linux; Windows supports only select), as well as the kqueue method (on BSD systems such as FreeBSD).
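Since the available methods differ by platform, it can help to check at runtime what the select module exposes. The following is a minimal sketch; the helper name available_multiplexers is made up for illustration and is not part of the module.

```python
import select

def available_multiplexers():
    """Return the names of the multiplexing interfaces exposed by select on this platform."""
    candidates = ("select", "poll", "epoll", "kqueue")
    # The module only defines the attributes the operating system supports.
    return [name for name in candidates if hasattr(select, name)]

print(available_multiplexers())
```

The result depends on the operating system the code runs on; only select itself is present on every platform.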
select method
The process tells the kernel which events on which file descriptors (up to 1024 fds by default) it wants monitored. While none of those events has occurred, the process is blocked. The process is woken up when an event occurs on one or more of the file descriptors.
When we call select():
1. Context switch to kernel mode
2. Copy the fds from user space to kernel space
3. The kernel traverses all fds to see whether the corresponding event has occurred
4. If no event has occurred, the process blocks. When a device driver raises an interrupt or the timeout expires, the process is woken up and the kernel traverses the fds again
5. Return the ready fds after the traversal
6. Copy the fds from kernel space back to user space
fd: file descriptor
fd_r_list, fd_w_list, fd_e_list = select.select(rlist, wlist, xlist[, timeout])
Parameters: up to 4 parameters are accepted (the first 3 are required)
Return value: 3 lists
The select method monitors file descriptors (it blocks while none of the monitored conditions is met) and returns three lists when the state of a file descriptor changes:
1. When an fd in the sequence passed as parameter 1 becomes readable, it is added to fd_r_list
2. When an fd in the sequence passed as parameter 2 becomes writable, it is added to fd_w_list
3. When an exceptional condition (such as an error) occurs on an fd in the sequence passed as parameter 3, that fd is added to fd_e_list
4. When timeout is omitted, select blocks until one of the monitored handles changes
When timeout = n (a non-negative number of seconds, which may be a float), select blocks for at most n seconds. If none of the monitored handles has changed by then, it returns three empty lists. If a handle changes earlier, select returns immediately.
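The return values and the timeout behavior described above can be tried without a server. This is a minimal sketch that assumes a connected pair from socket.socketpair() is an acceptable stand-in for a real client connection; the variable names are illustrative.

```python
import select
import socket

# A connected pair of sockets stands in for a client and a server connection.
a, b = socket.socketpair()

# Nothing has been sent yet, so select waits for the timeout
# and then returns three empty lists.
idle = select.select([a], [], [], 0.1)

# Once b sends data, a becomes readable and shows up in the first list.
# a is also writable because its send buffer has room.
b.sendall(b"ping")
readable, writable, errored = select.select([a], [a], [], 1)
got_readable = readable == [a]
got_writable = writable == [a]

print(idle)
print(got_readable, got_writable, errored)

a.close()
b.close()
```

The first call shows the timeout case, the second shows that the same socket can appear in more than one returned list.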
Example: implement a concurrent server using select
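Server side:
import socket
import select

s = socket.socket()
s.bind(('127.0.0.1', 8888))
s.listen(5)
r_list = [s]    # sockets monitored for readability
num = 0
try:
    while True:
        # Block for at most 10 seconds waiting for a readable socket
        rl, wl, error = select.select(r_list, [], [], 10)
        num += 1
        print('counts is %s' % num)
        print("rl's length is %s" % len(rl))
        for fd in rl:
            if fd is s:
                # The listening socket is readable: a new connection is pending.
                # Do not call recv() here, it would block the whole loop.
                conn, addr = fd.accept()
                r_list.append(conn)
            else:
                try:
                    msg = fd.recv(200)
                    if msg:
                        fd.sendall(('reply----%s' % fd.fileno()).encode())
                except ConnectionError:
                    msg = b''
                if not msg:
                    # Empty data or a connection error: the peer is gone
                    r_list.remove(fd)
                    fd.close()
finally:
    s.close()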
Client side:
import socket

flag = 1
s = socket.socket()
s.connect(('127.0.0.1', 8888))
while flag:
    input_msg = input('input>>>')
    if input_msg == '0':    # enter 0 to quit
        break
    if not input_msg:       # nothing to send, so no reply would arrive
        continue
    s.sendall(input_msg.encode())
    msg = s.recv(1024)
    print(msg.decode())
s.close()
On the server side we can see that select has to be called over and over, which means:
1. When there are many file descriptors, copying them between user space and kernel space on every call is time-consuming
2. When there are many file descriptors, the kernel's traversal of all of them on every call is also time-consuming
3. select supports at most 1024 file descriptors by default
poll differs little from select and is not covered in this article.
epoll method:
epoll improves on select in three ways:
1. epoll's solution lies in the epoll_ctl function. An fd is copied into the kernel when it is registered with the epoll handle, instead of being copied again on every epoll_wait call. epoll guarantees that each fd is copied only once during the whole process.
2. At epoll_ctl time, epoll walks over the specified fd once (this one pass is unavoidable) and attaches a callback function to it. When the device becomes ready and wakes up the waiters on its wait queue, the callback is invoked and adds the ready fd to a ready list. The job of epoll_wait is then just to check whether this ready list contains any fd.
3. epoll imposes no additional limit on the number of file descriptors beyond the system's limit on open files.
select.epoll(sizehint=-1, flags=0)
Create an epoll object.
epoll.close()
Close the control file descriptor of the epoll object.
epoll.closed
True if the epoll object is closed.
epoll.fileno()
Return the file descriptor number of the control fd.
epoll.fromfd(fd)
Create an epoll object from a given file descriptor.
epoll.register(fd[, eventmask])
Register a file descriptor, together with the events to watch, with the epoll object.
epoll.modify(fd, eventmask)
Modify the event mask of a registered file descriptor.
epoll.unregister(fd)
Remove a registered file descriptor from the epoll object.
epoll.poll(timeout=-1, maxevents=-1)
Wait for events; timeout is in seconds (float). Blocks until an event occurs on a registered fd or the timeout expires, then returns a list of (fd, event) tuples in the form [(fd1, event1), (fd2, event2), ... (fdn, eventn)].
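The methods listed above can be exercised in sequence on a single socket. The sketch below is illustrative only: it assumes Linux (epoll is not available elsewhere), uses socket.socketpair() in place of a network connection, and the function name epoll_walkthrough is invented for this example. The EPOLLIN and EPOLLOUT constants are the ones described under Events.

```python
import select
import socket

def epoll_walkthrough():
    """Register, poll, modify and unregister one socket; return what each poll saw."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        fd = a.fileno()
        ep.register(fd, select.EPOLLIN)     # watch a for readability
        before = ep.poll(0.1)               # nothing to read yet
        b.sendall(b"hello")
        after_send = ep.poll(1)             # a is now readable
        ep.modify(fd, select.EPOLLOUT)      # watch for writability instead
        after_modify = ep.poll(1)           # a has buffer space, so it is writable
        ep.unregister(fd)
        after_unregister = ep.poll(0.1)     # no fd registered any more
        return fd, before, after_send, after_modify, after_unregister
    finally:
        ep.close()
        a.close()
        b.close()

if hasattr(select, "epoll"):
    for step in epoll_walkthrough()[1:]:
        print(step)
```

Each poll() result is a list of (fd, event) tuples, so an empty list means that no registered fd was ready before the timeout expired.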
Events:
EPOLLIN         Available for read (value 1)
EPOLLOUT        Available for write (value 4)
EPOLLPRI        Urgent data for read
EPOLLERR        Error condition happened on the associated fd (value 8)
EPOLLHUP        Hang up happened on the associated fd
EPOLLET         Set edge-triggered behavior; the default is level-triggered behavior
EPOLLONESHOT    Set one-shot behavior. After one event is pulled out, the fd is internally disabled
EPOLLRDNORM     Equivalent to EPOLLIN
EPOLLRDBAND     Priority data band can be read
EPOLLWRNORM     Equivalent to EPOLLOUT
EPOLLWRBAND     Priority data may be written
EPOLLMSG        Ignored
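Because an event value returned by epoll.poll() is a bitmask, several of these flags can be set at once. The following sketch decodes a mask into flag names. The helper describe_events and the EVENT_FLAGS table are hypothetical names for illustration, and the numeric fallbacks are the Linux values, used only so the snippet also runs where the select module has no epoll constants.

```python
import select

# Look up each constant in the select module, falling back to the Linux value.
EVENT_FLAGS = {
    "EPOLLIN": getattr(select, "EPOLLIN", 0x001),
    "EPOLLPRI": getattr(select, "EPOLLPRI", 0x002),
    "EPOLLOUT": getattr(select, "EPOLLOUT", 0x004),
    "EPOLLERR": getattr(select, "EPOLLERR", 0x008),
    "EPOLLHUP": getattr(select, "EPOLLHUP", 0x010),
}

def describe_events(mask):
    """Return the names of all flags that are set in an epoll event mask."""
    return [name for name, bit in EVENT_FLAGS.items() if mask & bit]

print(describe_events(5))   # 5 = 1 | 4, i.e. EPOLLIN and EPOLLOUT together
```

In a server loop, the same test is usually written directly as `event & select.EPOLLIN` rather than going through names.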
Level trigger and edge trigger:
Level-triggered (sometimes called condition-triggered): when a read or write event occurs on a monitored file descriptor, epoll.poll() notifies the handler to read or write. If you do not read or write all the data in one go (for example because the buffer is too small), the next call to epoll.poll() reports the unfinished file descriptor again, and it keeps reporting it for as long as the data is left unhandled. If the system has a large number of ready file descriptors that you do not need to read or write, they are returned on every call, which makes it much less efficient for the handler to pick out the ready file descriptors it cares about. The advantage is clear: stability and reliability.
Edge-triggered (sometimes called state-triggered): when a read or write event occurs on a monitored file descriptor, epoll.poll() notifies the handler to read or write. If you do not read or write all the data this time (for example because the buffer is too small), the next call to epoll.poll() does not notify you again until a new read or write event occurs on that file descriptor. This mode is more efficient than level triggering, and the system is not flooded with ready file descriptors you do not care about. The drawback: data can be left unhandled if the handler does not drain the file descriptor completely after each notification.
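The difference between level-triggered and edge-triggered notification can be observed by deliberately leaving data unread. This is a sketch under stated assumptions: Linux only, a socket.socketpair() instead of a real client, and a made-up helper name second_poll.

```python
import select
import socket

def second_poll(extra_flags=0):
    """Send 10 bytes, read only 5 after the first notification, then poll again.

    Returns the number of events seen by the first and by the second poll().
    """
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(a.fileno(), select.EPOLLIN | extra_flags)
        b.sendall(b"0123456789")
        first = ep.poll(1)      # both modes report the fd here
        a.recv(5)               # deliberately leave 5 bytes unread
        second = ep.poll(0.2)   # level: reported again; edge: no new event
        return len(first), len(second)
    finally:
        ep.close()
        a.close()
        b.close()

if hasattr(select, "epoll"):
    print("level-triggered:", second_poll())
    print("edge-triggered: ", second_poll(select.EPOLLET))
```

With the default level-triggered mode the second poll still reports the fd because unread data remains, while with EPOLLET it reports nothing until more data arrives. This is why edge-triggered handlers typically use non-blocking sockets and read until no data is left.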
epoll example:
Server side:
import socket
import select

s = socket.socket()
s.bind(('127.0.0.1', 8888))
s.listen(5)
epoll_obj = select.epoll()
epoll_obj.register(s, select.EPOLLIN)    # watch the listening socket for new connections
connections = {}                         # maps fd number -> connection socket
try:
    while True:
        events = epoll_obj.poll()
        for fd, event in events:
            print(fd, event)
            if fd == s.fileno():
                # New connection: register it and let a later poll() report its data.
                # Do not call recv() here, it would block the whole loop.
                conn, addr = s.accept()
                connections[conn.fileno()] = conn
                epoll_obj.register(conn, select.EPOLLIN)
            else:
                fd_obj = connections[fd]
                try:
                    msg = fd_obj.recv(200)
                    if msg:
                        fd_obj.sendall('ok'.encode())
                except ConnectionError:
                    msg = b''
                if not msg:
                    # Empty data or a connection error: the peer is gone
                    epoll_obj.unregister(fd)
                    fd_obj.close()
                    del connections[fd]
finally:
    s.close()
    epoll_obj.close()
Client side:
import socket

flag = 1
s = socket.socket()
s.connect(('127.0.0.1', 8888))
while flag:
    input_msg = input('input>>>')
    if input_msg == '0':    # enter 0 to quit
        break
    if not input_msg:       # nothing to send, so no reply would arrive
        continue
    s.sendall(input_msg.encode())
    msg = s.recv(1024)
    print(msg.decode())
s.close()
conclusion