A Detailed Analysis of Tornado's Multi-Process Implementation

  • 2020-06-23 01:09:57
  • OfStack

Introduction

Tornado is an asynchronous networking and web framework that can use multiple processes to improve throughput. Here is an example of a multi-process Tornado program.


#!/usr/bin/env python
# -*- coding:utf-8 -*-
import os
import time

import tornado.web
import tornado.httpserver
import tornado.ioloop
import tornado.netutil
import tornado.process


class LongHandler(tornado.web.RequestHandler):

	def get(self):
		# Reply with the pid of the process that handled the request.
		self.write(str(os.getpid()))
		# Deliberately block this process for 10 seconds to simulate slow work.
		time.sleep(10)


if __name__ == "__main__":
	app = tornado.web.Application([(r'/', LongHandler)])
	# Bind the listening socket(s) before forking so the children share them.
	sockets = tornado.netutil.bind_sockets(8090)
	# Fork two child processes; each child continues from this point.
	tornado.process.fork_processes(2)
	server = tornado.httpserver.HTTPServer(app)
	server.add_sockets(sockets)
	tornado.ioloop.IOLoop.instance().start()

The code above uses tornado.process.fork_processes to create two child processes. Requesting the service twice at the same time returns two adjacent pids, which shows that Tornado really does use both processes to handle the requests concurrently.
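One way to issue the two requests at the same time is a small threaded client. This is only a minimal sketch under assumptions: it uses the Python 3 standard library, it assumes the server above is already running on port 8090, and fetch_concurrently is a hypothetical helper name that is not part of Tornado.

```python
import threading
from urllib.request import urlopen


def fetch_concurrently(url, count=2, timeout=30):
    """Send `count` GET requests to `url` at the same time.

    Hypothetical helper for illustration. Returns the response bodies as
    strings, in the order the requests were started.
    """
    results = [None] * count

    def worker(index):
        # Each thread performs one blocking request and stores the body.
        with urlopen(url, timeout=timeout) as response:
            results[index] = response.read().decode()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling fetch_concurrently("http://127.0.0.1:8090/") against the example server returns the pids written by the handler. Because of the 10-second sleep, the call takes at least that long to return.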

I had long been curious about how Tornado dispatches requests to the child processes, and how it ensures that several children never handle the same request.

Exploration

First we call tornado.netutil.bind_sockets to create a socket (or a list of sockets).

Then we call tornado.process.fork_processes to fork the child processes. Reading its source shows that the function simply creates the children, after which the parent process is responsible for waiting on them. If a child exits, it is restarted when certain conditions are met.

In each child process, fork_processes returns, and the child carries on executing the code that follows the call.
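The "returns in the child, waits in the parent" behaviour can be sketched with plain os.fork. This is a simplified stand-in and not Tornado's actual code: fork_children and demo are hypothetical names, the parent here returns instead of supervising and restarting children, and it runs only on systems that provide os.fork.

```python
import os


def fork_children(num_children):
    """Fork `num_children` children (simplified stand-in, not Tornado's code).

    In each child: returns (task_id, []) immediately after the fork.
    In the parent: returns (None, [child pids]) once all children exist.
    """
    child_pids = []
    for task_id in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: return right away so the caller's next lines run here.
            return task_id, []
        child_pids.append(pid)
    return None, child_pids


def demo(num_children=2):
    """Show that code after the fork call runs once per child."""
    read_fd, write_fd = os.pipe()
    task_id, child_pids = fork_children(num_children)
    if task_id is not None:
        # Child: this is "the code after the call". Report and exit.
        try:
            os.close(read_fd)
            os.write(write_fd, ("%d:%d\n" % (task_id, os.getpid())).encode())
        finally:
            os._exit(0)
    # Parent: wait for the children, then collect their reports.
    os.close(write_fd)
    for pid in child_pids:
        os.waitpid(pid, 0)
    with os.fdopen(read_fd) as reader:
        lines = reader.read().split()
    reports = sorted(tuple(int(x) for x in line.split(":")) for line in lines)
    return reports, child_pids
```

demo(2) returns one (task_id, pid) pair per child along with the pids the parent saw, so each child's report can be matched against a pid returned by os.fork.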

After forking the children, we did the following:


server = tornado.httpserver.HTTPServer(app)
server.add_sockets(sockets)
tornado.ioloop.IOLoop.instance().start()

Looking at tornado.httpserver.HTTPServer.add_sockets, we find that HTTPServer inherits from tornado.netutil.TCPServer, and that add_sockets is implemented in TCPServer.

tornado.netutil.TCPServer.add_sockets


def add_sockets(self, sockets):
	if self.io_loop is None:
		self.io_loop = IOLoop.instance()

	for sock in sockets:
		self._sockets[sock.fileno()] = sock
		add_accept_handler(sock, self._handle_connection,
						   io_loop=self.io_loop)

It records each socket under its file descriptor. Next, let's look at the add_accept_handler function that it calls:


def add_accept_handler(sock, callback, io_loop=None):
	if io_loop is None:
		io_loop = IOLoop.instance()

	def accept_handler(fd, events):
		while True:
			try:
				connection, address = sock.accept()
			except socket.error as e:
				if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
					return
				raise
			callback(connection, address)
	io_loop.add_handler(sock.fileno(), accept_handler, IOLoop.READ)

With I/O multiplexing, a listening (server) socket becomes readable when a connection request arrives. This function registers the socket with the event loop for read events (IOLoop.READ), and its callback accepts the connection. Note this part of the callback:


if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
	return
raise

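Notice that this exception is silently ignored when accepting a connection. Why? What are EWOULDBLOCK and EAGAIN? Looking them up shows that, on a non-blocking socket, they mean the operation cannot complete right now because it would have to block. For accept, that simply means there is no pending connection to take, and nothing needs to be retried immediately. EAGAIN and EWOULDBLOCK are two names for this condition (on Linux they have the same value, though POSIX allows them to differ), which is why the code checks both. With that, the picture is clear.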
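The error is easy to reproduce with the standard library alone. The following is a minimal sketch, not Tornado code: make_listener and try_accept are hypothetical helper names, and the error check is copied from the accept_handler excerpt above.

```python
import errno
import socket


def make_listener():
    """Create a non-blocking listening socket on an OS-chosen local port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    listener.setblocking(False)
    return listener


def try_accept(listener):
    """Attempt one non-blocking accept; return the connection or None.

    EWOULDBLOCK/EAGAIN means "nothing to accept right now" and is ignored;
    any other socket error is re-raised.
    """
    try:
        connection, _address = listener.accept()
    except socket.error as e:
        if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
            return None
        raise
    return connection
```

On a fresh listener with no client connected, try_accept returns None instead of blocking. After a client connects, one call returns the connection and the next call returns None again, which is the situation a child is in when another child has already taken the connection.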

Conclusion

Tornado's multi-process mode works by creating the socket first and then forking the children, so every child ends up listening on the same file descriptor (or descriptors); that is, they all listen on the same socket.

When a connection arrives, all the children receive a readable event, and each of them enters the accept_handler callback and tries to accept the connection.

Once one child has accepted the connection, the other children get an EWOULDBLOCK (or EAGAIN) error when they try to accept it too. The callback recognizes this error and returns without doing anything.

If another connection arrives while that child is still busy with its connection, a different child picks it up.

This is how Tornado uses multiple processes to improve throughput. Because a connection can be accepted by only one child, the same request is never handled by more than one child process.
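The whole mechanism can be sketched without Tornado: bind first, fork, and let every child race to accept on the shared non-blocking socket. This is an illustrative sketch under assumptions, not Tornado's implementation: child_loop, read_all and prefork_demo are hypothetical names, select stands in for the IOLoop, and it needs a platform with os.fork.

```python
import errno
import os
import select
import signal
import socket


def child_loop(listener):
    """Child: wait for a readable event, try to accept, reply with our pid."""
    while True:
        select.select([listener], [], [])
        try:
            conn, _addr = listener.accept()
        except socket.error as e:
            if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
                continue  # Another child accepted this connection first.
            raise
        conn.sendall(str(os.getpid()).encode())
        conn.close()


def read_all(sock):
    """Read from `sock` until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(64)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def prefork_demo(num_children=2, num_requests=6):
    """Return (child pids, pid that answered each request)."""
    # Create the listening socket BEFORE forking so the children inherit it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    listener.setblocking(False)

    children = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            try:
                child_loop(listener)
            finally:
                os._exit(0)
        children.append(pid)

    answers = []
    try:
        for _ in range(num_requests):
            client = socket.create_connection(listener.getsockname(), timeout=5)
            answers.append(int(read_all(client).decode()))
            client.close()
    finally:
        # Stop and reap the children, which otherwise loop forever.
        for pid in children:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
        listener.close()
    return children, answers
```

Every request in prefork_demo receives exactly one reply, and the replying pid is always one of the forked children. Which child wins a given connection is up to the operating system, so the sketch makes no claim about how evenly the requests are spread.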

Afterword

After writing this, I realized that the code I had been reading was tornado-2.4.post2, while the latest version is 3.3.0. I checked the latest code: TCPServer now lives in its own module, tornado.tcpserver.
