A super simple way to download images with Python
- 2020-06-12 09:55:36
- OfStack
Sometimes we need to find and download pictures from the Internet. When there are only a few, right-clicking and choosing save is enough. But some pages are set up so that the right-click menu shows no save option, and sometimes there are simply too many pictures to save by hand. In those cases a short piece of Python crawler code solves the problem easily.
1. Page scraping
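#coding=utf-8
# Python 2 code: urllib.urlopen and the print statement do not exist in Python 3
import urllib

def getHtml(url):
    # Open the URL and return the raw HTML of the page
    page = urllib.urlopen(url)
    html = page.read()
    return html

html = getHtml("https://tieba.baidu.com/p/5582243679")
print html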
The page fetching step defines the getHtml() function: pass it a URL and it downloads and returns the HTML of the entire page.
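The listing above targets Python 2. As a minimal sketch of the same step in Python 3, urllib.request.urlopen() takes the place of urllib.urlopen(), and the bytes returned by read() are decoded to text. The function name get_html and the UTF-8 default are illustrative assumptions, not part of the original code.

```python
import urllib.request


def get_html(url, encoding="utf-8"):
    """Download a page and return its HTML as text.

    Assumption: the page is encoded as `encoding`; bytes that cannot be
    decoded are replaced instead of raising an error.
    """
    with urllib.request.urlopen(url) as page:
        return page.read().decode(encoding, errors="replace")
```

Calling get_html("https://tieba.baidu.com/p/5582243679") would fetch the thread used in this article, provided the site still serves that page to a plain urllib client.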
2. Page data filtering
import re
import urllib

def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    # Capture the src value of every .jpg image that is followed by a pic_ext attribute
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = re.findall(imgre, html)
    return imglist

html = getHtml("https://tieba.baidu.com/p/5582243679")
print getImg(html)
The page data filtering step adds a new function, getImg(), which uses a regular expression to pick out the addresses of the images in .jpg format.
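To see what the regular expression keeps and what it skips, here is a small Python 3 sketch that runs the same pattern on a hand-written HTML fragment. The fragment and its example.com URLs are made up for illustration; the markup of the real page may differ.

```python
import re

# Same pattern as getImg(): capture the src value of .jpg images
# whose src attribute is immediately followed by a pic_ext attribute.
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" pic_ext')

# Hypothetical fragment, one tag per line, written only to exercise the pattern.
sample_html = "\n".join([
    '<img class="BDE_Image" src="https://example.com/a/001.jpg" pic_ext="jpeg">',
    '<img src="https://example.com/static/logo.jpg">',
    '<img class="BDE_Image" src="https://example.com/a/002.jpg" pic_ext="jpeg">',
])

# findall() returns only the captured group, i.e. the image URLs.
image_urls = IMG_PATTERN.findall(sample_html)
print(image_urls)
```

The logo is skipped because its src is not followed by pic_ext. Note that `.` does not match a newline, so a match never spans lines here; if all the tags sat on one long line, the non-greedy group could run from one src past other tags until it reached the next `.jpg" pic_ext`.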
3. Picture download
#coding=utf-8
import urllib
import re

def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = re.findall(imgre, html)
    x = 0
    for imgurl in imglist:
        # Save each image to the current directory as 0.jpg, 1.jpg, ...
        urllib.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    # Return the list so the print below shows the image URLs instead of None
    return imglist

html = getHtml("https://tieba.baidu.com/p/5582243679")
print getImg(html)
A for loop walks through all the matching picture URLs, and the urllib.urlretrieve() method downloads each remote file to the local disk under a new name (0.jpg, 1.jpg, and so on).
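As a rough Python 3 counterpart of the download loop, the sketch below uses urllib.request.urlretrieve() together with enumerate() instead of a manual counter. The function name download_images and the target_dir parameter are illustrative, not part of the original code.

```python
import os
import urllib.request


def download_images(image_urls, target_dir="."):
    """Save each URL as 0.jpg, 1.jpg, ... in target_dir and return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved_paths = []
    for index, image_url in enumerate(image_urls):
        local_path = os.path.join(target_dir, "%s.jpg" % index)
        # urlretrieve copies the resource behind image_url to local_path
        urllib.request.urlretrieve(image_url, local_path)
        saved_paths.append(local_path)
    return saved_paths
```

Given a list of image URLs extracted from a page, download_images(urls, "pictures") would write them into a pictures folder; the folder name is only an example.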
Supplement: downloading a single image with urllib.request (Python 3), as shown below:
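import urllib.request

# Fetch the image and read the response body as bytes
response = urllib.request.urlopen('https://www.ofstack.com/g/500/600')
cat_img = response.read()
# Write the bytes to a local file in binary mode
with open('cat_500_600.jpg', 'wb') as f:
    f.write(cat_img)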
The argument of urlopen() can be either a URL string or a Request object. A string is converted to a Request object internally when it is passed in.
So response = urllib.request.urlopen('https://www.ofstack.com/g/500/600') can also be written as:
req = urllib.request.Request('https://www.ofstack.com/g/500/600')
response = urllib.request.urlopen(req)
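One practical reason to build the Request object yourself is that it can carry extra HTTP headers. The sketch below only constructs the request and inspects it, so it needs no network access. The User-Agent value is an arbitrary example, and whether a given site requires one is an assumption.

```python
import urllib.request

url = 'https://www.ofstack.com/g/500/600'

# Build the request explicitly; headers is an optional dict of HTTP headers.
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})

print(req.full_url)                  # the URL the request points at
print(req.get_method())              # GET, because no data was attached
print(req.get_header('User-agent'))  # Request stores header names capitalized

# Passing req to urllib.request.urlopen(req) would then send these headers.
```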
The response object also provides the geturl(), info() and getcode() methods.
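A minimal sketch of geturl(), info() and getcode(), run against a temporary local file through a file:// URL so that it works offline. The file name and its contents are made up. For an http or https URL, getcode() would instead return the HTTP status code, such as 200.

```python
import pathlib
import tempfile
import urllib.request

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-in for a remote image.
    sample = pathlib.Path(tmp_dir) / "sample.jpg"
    sample.write_bytes(b"not really a jpeg")

    with urllib.request.urlopen(sample.as_uri()) as response:
        final_url = response.geturl()  # URL of the resource actually retrieved
        headers = response.info()      # headers as a message object
        status = response.getcode()    # None here: file:// has no HTTP status
        content_length = headers["Content-Length"]

print(final_url)
print(content_length)
print(status)
```

Recent Python 3 versions expose the same information through the url, headers and status attributes of the response.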
with open('cat_500_600.jpg', 'wb') as f:
    f.write(cat_img)
is equivalent to
f = open('cat_500_600.jpg', 'wb')
try:
    data = f.write(cat_img)
finally:
    f.close()
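To check the equivalence, the sketch below writes the same bytes both ways into a temporary directory and confirms that each file ends up closed with identical content. The byte string and file names are placeholders, not real image data.

```python
import os
import tempfile

cat_img = b"placeholder image bytes"  # stand-in for response.read()

with tempfile.TemporaryDirectory() as tmp_dir:
    path_with = os.path.join(tmp_dir, "with.jpg")
    path_try = os.path.join(tmp_dir, "try_finally.jpg")

    # Version 1: the with statement closes the file when the block exits.
    with open(path_with, "wb") as f:
        f.write(cat_img)
    closed_after_with = f.closed

    # Version 2: the explicit try/finally form.
    g = open(path_try, "wb")
    try:
        g.write(cat_img)
    finally:
        g.close()
    closed_after_finally = g.closed

    # Compare what ended up on disk.
    with open(path_with, "rb") as a, open(path_try, "rb") as b:
        same_content = a.read() == b.read()

print(closed_after_with, closed_after_finally, same_content)
```

The with form is usually preferred because the close cannot be forgotten, even when write() raises an exception.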