Summary of Four Benefits of the Python Thread Pool
- 2021-11-02 01:31:56
- OfStack
1. Benefits of use
Improved performance: Threads are reused, which avoids the cost of repeatedly creating and destroying a large number of threads.
Applicable scenarios: Well suited to handling sudden bursts of requests, or workloads that need many threads where each individual task takes only a short time.
Protection: Capping the number of threads keeps the system from being overloaded and slowed down by too many threads.
Code advantage: The thread pool syntax is simpler than creating and managing your own threads.
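The first and third points can be observed directly. The following is a minimal sketch, not taken from the original article: handle_request is a hypothetical short task, and the values 20 and 3 are arbitrary. It submits a burst of tasks to a pool capped at three workers and records which threads actually ran them.

```python
import concurrent.futures
import threading
import time


def handle_request(request_id):
    # Hypothetical short task: pretend to do some work, then report
    # which worker thread handled it.
    time.sleep(0.01)
    return request_id, threading.current_thread().name


def run_burst(num_requests=20, max_workers=3):
    # max_workers caps the number of threads, no matter how many tasks arrive.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(handle_request, range(num_requests)))
    # Collect the distinct thread names that were used.
    worker_names = {name for _, name in results}
    return results, worker_names


results, worker_names = run_burst()
print(len(results), "tasks were handled by", len(worker_names), "thread(s)")
```

All 20 tasks finish, but the set of thread names never grows beyond the cap, because the same few threads are reused for every task.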
2. Examples
"""
@file : 004- Use of thread pool .py
@author : xiaolu
@email : luxiaonlp@163.com
@time : 2021-02-01
"""
import concurrent.futures

import requests
from bs4 import BeautifulSoup


def craw(url):
    # Crawl the content of a web page
    r = requests.get(url)
    return r.text


def parse(html):
    # Parse the contents
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", class_="post-item-title")
    return [(link["href"], link.get_text()) for link in links]  # Take out the link and title


if __name__ == '__main__':
    # Links to Web pages to be crawled
    urls = [
        "https://www.cnblogs.com/sitehome/p/{}".format(page) for page in range(1, 50 + 1)
    ]

    # craw
    with concurrent.futures.ThreadPoolExecutor() as pool:
        htmls = pool.map(craw, urls)
        htmls = list(zip(urls, htmls))
        for url, html in htmls:
            print(url, len(html))
    print("craw over")

    # parse
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {}
        for url, html in htmls:
            future = pool.submit(parse, html)
            futures[future] = url

        # for future, url in futures.items():
        #     print(url, future.result())

        for future in concurrent.futures.as_completed(futures):
            url = futures[future]
            print(url, future.result())
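The example uses two ways of collecting results: pool.map for crawling and as_completed for parsing. The difference is in ordering, which the sketch below tries to make visible without any network access. slow_echo and the delay values are hypothetical and only serve the illustration.

```python
import concurrent.futures
import time


def slow_echo(delay):
    # Hypothetical task: sleep for `delay` seconds, then return the delay.
    time.sleep(delay)
    return delay


delays = [0.3, 0.1, 0.2]

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
    # map yields results in the same order as the inputs,
    # even if a later task finishes first.
    in_input_order = list(pool.map(slow_echo, delays))

    # as_completed yields futures as they finish, so a dict is used
    # to look up which input each future belongs to.
    futures = {pool.submit(slow_echo, d): d for d in delays}
    in_completion_order = [
        future.result() for future in concurrent.futures.as_completed(futures)
    ]

print("map:", in_input_order)
print("as_completed:", in_completion_order)
```

This is why the parsing step keeps a futures-to-URL dictionary: with as_completed, the position of a result no longer tells you which URL it came from.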
Additional notes:
Using thread pools
The base class for thread pools is Executor in the concurrent.futures module. Executor has two subclasses: ThreadPoolExecutor, which creates thread pools, and ProcessPoolExecutor, which creates process pools.
When you use a thread pool or process pool for concurrent programming, you only need to submit the task function to the pool; the pool takes care of scheduling and running it.
Executor provides the following common methods:
submit(fn, *args, **kwargs): Submits the function fn to the thread pool. *args are the positional arguments passed to fn, and **kwargs are the keyword arguments passed to fn.
map(func, *iterables, timeout=None, chunksize=1): Similar to the built-in map(func, *iterables), except that the iterables are collected immediately rather than lazily and the calls to func run asynchronously on the pool's threads. Results are yielded in input order.
shutdown(wait=True): Shuts down the thread pool. No new tasks can be submitted afterwards; with wait=True, the call blocks until all pending tasks have finished.
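The three methods can be exercised together in a few lines. This is a minimal sketch: the power function and its arguments are hypothetical, chosen only to show how positional and keyword arguments are forwarded.

```python
from concurrent.futures import ThreadPoolExecutor


def power(base, exponent=2):
    # Hypothetical task function with one positional and one keyword argument.
    return base ** exponent


pool = ThreadPoolExecutor(max_workers=2)
try:
    # submit forwards *args and **kwargs to the function: power(3, exponent=3)
    future = pool.submit(power, 3, exponent=3)
    cube = future.result()

    # map accepts several iterables, like the built-in map
    sums = list(pool.map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
finally:
    # Block until pending tasks finish, then release the worker threads.
    pool.shutdown(wait=True)

# After shutdown, the pool rejects new tasks with a RuntimeError.
try:
    pool.submit(power, 2)
    rejected = False
except RuntimeError:
    rejected = True

print(cube, sums, rejected)
```

Using the pool in a with statement, as the crawler example does, calls shutdown(wait=True) automatically when the block exits.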
After the program submits a task function to the thread pool with submit, the method returns a Future object. The Future class is mainly used to obtain the return value of the task function. Python uses a Future to represent the task because it runs asynchronously in another thread, so the function being executed is effectively a task that will be "completed in the future".
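Besides result(), a Future also exposes done(), exception(), and add_done_callback(). The sketch below illustrates these under simple assumptions: divide is a hypothetical task, and the 5-second timeouts are arbitrary.

```python
import concurrent.futures


def divide(a, b):
    # Hypothetical task; raises ZeroDivisionError when b is 0.
    return a / b


completed = []

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    ok = pool.submit(divide, 10, 2)
    bad = pool.submit(divide, 1, 0)

    # The callback receives the finished future as its only argument.
    ok.add_done_callback(lambda f: completed.append(f.result()))

    # result() blocks until the task finishes and returns its value.
    ok_value = ok.result(timeout=5)

    # exception() returns the exception raised by the task instead of
    # re-raising it; calling bad.result() would raise it here.
    bad_error = bad.exception(timeout=5)

print(ok_value, type(bad_error).__name__, ok.done(), bad.done())
```

An exception raised inside a task does not surface in the main thread until result() or exception() is called on its Future, so it is worth checking every future that was submitted.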