Several Ways to Download Files in Python
- 2021-10-25 07:14:47
- OfStack
1. Synchronous download
2. Streaming download with the stream parameter of requests.get
3. Download files asynchronously
4. Split and download files asynchronously
5. Notes
1. Synchronous download
Sample code:
import requests


def download(url, file_path):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0"
    }
    # The whole response body is loaded into memory before it is written out.
    r = requests.get(url=url, headers=headers)
    with open(file_path, "wb") as f:
        f.write(r.content)
2. Streaming download with the stream parameter of requests.get
By default, stream is False, so requests.get downloads the whole response body immediately and holds it in memory. If the file is too large, memory runs out and the program fails with an error.
When the stream parameter of get is set to True, only the response headers are fetched at first. The body is downloaded when you iterate over it with iter_content or iter_lines, or when you access the content attribute. Note that the connection stays open until the body has been fully read or the response is closed.
iter_content: iterates over the content to be downloaded chunk by chunk
iter_lines: iterates over the content to be downloaded line by line
Downloading large files with these two functions prevents excessive memory usage, because only a small portion of the data is held in memory at a time.
Sample code:
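A minimal sketch of a streamed download with iter_content. The function name download_stream, the 8192-byte chunk size, and the 30-second timeout are assumptions chosen for illustration, not values from the original article:

```python
import requests


def download_stream(url, file_path, chunk_size=8192):
    """Stream url to file_path without holding the whole body in memory."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0"
    }
    # stream=True defers the body download; the with block releases the
    # connection when the loop finishes or an exception is raised.
    with requests.get(url, headers=headers, stream=True, timeout=30) as r:
        r.raise_for_status()
        with open(file_path, "wb") as f:
            # Each iteration holds at most chunk_size bytes in memory.
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
```

A larger chunk_size means fewer write calls at the cost of a bigger buffer. iter_lines would suit text that is processed line by line rather than binary files.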
3. Download files asynchronously
Since calls made with requests are blocking, the aiohttp module is used to send the request asynchronously.
Sample code:
import aiohttp
import asyncio
import os


async def handler(url, file_path):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0"
    }
    async with aiohttp.ClientSession() as session:
        # async with releases the connection once the body has been read.
        async with session.get(url=url, headers=headers) as r:
            with open(file_path, "wb") as f:
                f.write(await r.read())
                f.flush()
                os.fsync(f.fileno())


# Fill in url and file_path before running.
url = ""
file_path = ""
asyncio.run(handler(url, file_path))
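The benefit of a non-blocking client shows up when several files are fetched at once. The following is a minimal sketch of that idea. The helpers fetch_to_file and download_all and the 64 KiB chunk size are hypothetical names and values introduced here, not part of the original article:

```python
import asyncio
import aiohttp


async def fetch_to_file(session, url, file_path, chunk_size=64 * 1024):
    """Stream one response body to disk in fixed-size chunks."""
    async with session.get(url) as resp:
        resp.raise_for_status()
        with open(file_path, "wb") as f:
            async for chunk in resp.content.iter_chunked(chunk_size):
                f.write(chunk)


async def download_all(jobs):
    """Download every (url, file_path) pair in jobs concurrently."""
    # One session is shared so connections can be pooled across requests.
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *(fetch_to_file(session, url, path) for url, path in jobs)
        )
```

It could be run with asyncio.run(download_all([(url_a, "a.bin"), (url_b, "b.bin")])), where the URLs and file names are placeholders. The file writes themselves are still blocking calls. Only the network waits overlap.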
4. Split and download files asynchronously
The previous section uses one coroutine to download a whole file. The method below splits the file into several parts, downloads each part in its own coroutine, and writes every part to its own position in the output file. This requires the server to support HTTP Range requests.
The following example uses streaming writes, that is, each chunk is written to disk as soon as it arrives.
import aiohttp
import asyncio
import time
import os


async def consumer(queue):
    # Each consumer takes one part description from the queue and downloads it.
    option = await queue.get()
    start = option["start"]
    end = option["end"]
    url = option["url"]
    filename = option["filename"]
    i = option["i"]
    print(f"Task {i} started")
    async with aiohttp.ClientSession() as session:
        # The end offset of a Range header is inclusive.
        headers = {"Range": f"bytes={start}-{end}"}
        async with session.get(url=url, headers=headers) as r:
            with open(filename, "rb+") as f:
                f.seek(start)
                while True:
                    # Read at most 64 KiB at a time and write it straight to disk.
                    chunk = await r.content.read(64 * 1024)
                    if not chunk:
                        break
                    f.write(chunk)
                    f.flush()
                    os.fsync(f.fileno())
                    print(f"Task {i} is writing")
    queue.task_done()
    print(f"Task {i} finished writing")


async def producer(url, headers, filename, queue, coro_num):
    async with aiohttp.ClientSession() as session:
        # A HEAD request returns only the headers, which is enough to get the size.
        async with session.head(url=url, headers=headers) as resp:
            file_size = int(resp.headers["content-length"])
    # Create an empty file for the consumers to write into.
    with open(filename, "wb") as f:
        pass
    part = file_size // coro_num
    for i in range(coro_num):
        start = part * i
        if i == coro_num - 1:
            # The last part also covers the remainder.
            end = file_size - 1
        else:
            end = start + part - 1
        info = {
            "start": start,
            "end": end,
            "url": url,
            "filename": filename,
            "i": i,
        }
        queue.put_nowait(info)


async def main():
    # Fill in url, filename, and coro_num (the number of parts, at least 1).
    url = ""
    filename = ""
    coro_num = 0
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0"
    }
    queue = asyncio.Queue(coro_num)
    await producer(url, headers, filename, queue, coro_num)
    task_list = []
    for i in range(coro_num):
        task = asyncio.create_task(consumer(queue))
        task_list.append(task)
    await queue.join()
    for task in task_list:
        task.cancel()
    await asyncio.gather(*task_list, return_exceptions=True)


startt = time.time()
asyncio.run(main())
elapsed = time.time() - startt
print(f"Used {elapsed} seconds")
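Because the end offset in an HTTP Range header is inclusive, adjacent parts must not share a boundary byte, and the last part must stop at file_size - 1. The arithmetic can be sketched on its own, without any network access. The helper name split_ranges is hypothetical and introduced only for this illustration:

```python
def split_ranges(file_size, parts):
    """Return inclusive (start, end) byte ranges that cover file_size bytes."""
    if file_size <= 0 or parts <= 0:
        raise ValueError("file_size and parts must be positive")
    parts = min(parts, file_size)  # never create empty ranges
    part = file_size // parts
    ranges = []
    for i in range(parts):
        start = part * i
        # The last range absorbs the remainder of the integer division.
        end = file_size - 1 if i == parts - 1 else start + part - 1
        ranges.append((start, end))
    return ranges


for start, end in split_ranges(10, 3):
    print(f"Range: bytes={start}-{end}")
# Range: bytes=0-2
# Range: bytes=3-5
# Range: bytes=6-9
```

Each byte of the file falls into exactly one range, so no two coroutines write to the same position.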
5. Notes
The examples above only illustrate the basic ideas and are not robust. A robust program needs to catch and handle errors such as timeouts, dropped connections, and HTTP error responses.
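One possible shape for such error handling, shown for the synchronous case, is sketched below. The function name download_with_retry, the retry count, the backoff delay, and the timeout are all assumptions made for this illustration:

```python
import time
import requests


def download_with_retry(url, file_path, retries=3, backoff=1.0, timeout=30):
    """Stream url to file_path, retrying on network or HTTP errors."""
    for attempt in range(1, retries + 1):
        try:
            with requests.get(url, stream=True, timeout=timeout) as r:
                r.raise_for_status()  # turn 4xx/5xx responses into exceptions
                with open(file_path, "wb") as f:
                    for chunk in r.iter_content(chunk_size=8192):
                        f.write(chunk)
            return True
        except requests.exceptions.RequestException as exc:
            print(f"Attempt {attempt} failed: {exc}")
            if attempt < retries:
                # Wait a little longer after each failed attempt.
                time.sleep(backoff * attempt)
    return False
```

The function returns True on success and False once all attempts have failed. Errors raised while writing to disk, such as a full disk, are deliberately not caught here and propagate to the caller.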