A super simple way to download images with Python

  • 2020-06-12 09:55:36
  • OfStack

Sometimes we need to find pictures on the Internet and download them. When there are only a few, right-clicking and choosing Save is enough. But some pages are set up so that the right-click menu shows no save option, and sometimes you need to download a large number of pictures. In those cases, a short piece of Python crawler code solves the problem easily.

1. Page scraping


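# coding=utf-8
# Python 2 code: urllib.urlopen and the print statement do not exist in Python 3.
import urllib

def getHtml(url):
    # Open the URL and read the whole response body
    page = urllib.urlopen(url)
    html = page.read()
    return html

html = getHtml("https://tieba.baidu.com/p/5582243679")
print html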

The page fetching step defines the getHtml() function: you pass a URL to getHtml(), and it downloads and returns the entire page.
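
The listing above targets Python 2, where urlopen() sits directly in urllib. On Python 3 the same call lives in urllib.request and read() returns bytes rather than text. A minimal sketch of an equivalent, assuming the page is UTF-8 encoded; the name get_html and the encoding and timeout parameters are illustrative and not part of the original:

```python
import urllib.request


def get_html(url, encoding="utf-8", timeout=10):
    """Fetch a page and return its body as text (Python 3)."""
    # urlopen() returns a response object; using it as a context manager
    # closes the connection when the block exits.
    with urllib.request.urlopen(url, timeout=timeout) as page:
        raw = page.read()  # bytes in Python 3
    # Assumption: the page is UTF-8; undecodable bytes are replaced.
    return raw.decode(encoding, errors="replace")


# Usage (needs network access):
# html = get_html("https://tieba.baidu.com/p/5582243679")
# print(html[:200])
```

Decoding to text here matters for the next step, because a str regular expression cannot be applied to bytes in Python 3.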

2. Page data filtering


# Python 2 code
import re
import urllib

def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    # Capture the src value of every .jpg image that is followed by pic_ext
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = re.findall(imgre, html)
    return imglist

html = getHtml("https://tieba.baidu.com/p/5582243679")
print getImg(html)

The page data filtering step defines a new function, getImg(), which picks out the image addresses in .jpg format.
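
To see what the pattern matches, the sketch below runs it on a small hand-written HTML fragment. The sample markup and the example.com URLs are made up for illustration; they only imitate the `src="..." pic_ext` layout the regular expression expects. The non-greedy `.+?` stops at the first `.jpg" pic_ext`, so each matching image tag yields one address:

```python
import re

# Hypothetical fragment imitating the markup the pattern expects.
SAMPLE_HTML = (
    '<img class="BDE_Image" src="https://example.com/a/001.jpg" pic_ext="jpeg">'
    '<img class="BDE_Image" src="https://example.com/a/002.jpg" pic_ext="jpeg">'
    '<img class="icon" src="https://example.com/icons/star.png">'
)

# Same pattern as in the article.
IMG_RE = re.compile(r'src="(.+?\.jpg)" pic_ext')


def get_img(html):
    """Return every .jpg address that is followed by a pic_ext attribute."""
    return IMG_RE.findall(html)


print(get_img(SAMPLE_HTML))
# ['https://example.com/a/001.jpg', 'https://example.com/a/002.jpg']
```

The .png icon is skipped because it has neither a .jpg suffix nor a pic_ext attribute. Note that `.` also matches quote characters, so a non-matching src that appears earlier on the same line can be swallowed into the next match; `[^"]+` in place of `.+?` is a stricter alternative.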

3. Picture download


# coding=utf-8
# Python 2 code
import urllib
import re

def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = re.findall(imgre, html)
    x = 0
    for imgurl in imglist:
        # Save each picture locally as 0.jpg, 1.jpg, ...
        urllib.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    # Return the list so the print below shows what was downloaded
    return imglist

html = getHtml("https://tieba.baidu.com/p/5582243679")
print getImg(html)

A for loop walks through every matching picture URL, and the urllib.urlretrieve() method downloads each remote file to the local disk under a new, numbered name.
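
On Python 3, urlretrieve() lives in urllib.request. A minimal sketch of the same download loop under that assumption; download_images and dest_dir are illustrative names that do not appear in the original:

```python
import os
import urllib.request


def download_images(img_urls, dest_dir="."):
    """Save each URL as 0.jpg, 1.jpg, ... in dest_dir and return the paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    # enumerate() replaces the manual x = 0 / x += 1 counter.
    for index, img_url in enumerate(img_urls):
        target = os.path.join(dest_dir, "%s.jpg" % index)
        # urlretrieve() copies the remote resource to a local file.
        urllib.request.urlretrieve(img_url, target)
        saved.append(target)
    return saved


# Usage (needs network access), with a list of image addresses:
# download_images(["https://www.ofstack.com/g/500/600"], "pictures")
```

Returning the saved paths makes it easy to check afterwards which files were actually written.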

Supplement: downloading a single image with Python 3's urllib.request, as shown below:


import urllib.request
response = urllib.request.urlopen('https://www.ofstack.com/g/500/600')
cat_img = response.read()

with open('cat_500_600.jpg','wb') as f:
    f.write(cat_img)

The argument to urlopen() can be either a URL string or a Request object; a string is converted to a Request object internally when it is passed in.

response = urllib.request.urlopen('https://www.ofstack.com/g/500/600') can also be written as:

req = urllib.request.Request('https://www.ofstack.com/g/500/600')
response = urllib.request.urlopen(req)

The response object also provides the geturl(), info() and getcode() methods.
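
As a rough illustration of those response methods, the sketch below opens a URL through an explicit Request object and collects what geturl(), info() and getcode() report. The helper name describe_response and the dictionary keys are made up for this example:

```python
import urllib.request


def describe_response(url):
    """Open url via a Request object and summarise the response."""
    req = urllib.request.Request(url)
    with urllib.request.urlopen(req) as response:
        return {
            "url": response.geturl(),          # final URL, after any redirects
            "status": response.getcode(),      # HTTP status code, e.g. 200
            "headers": dict(response.info()),  # response headers
            "size": len(response.read()),      # body length in bytes
        }


# Usage (needs network access):
# print(describe_response("https://www.ofstack.com/g/500/600"))
```

Since Python 3.9 these three methods are documented as deprecated in favour of the url, headers and status attributes, though they still work.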

with open('cat_500_600.jpg','wb') as f:
    f.write(cat_img)

is equivalent to:

f = open('cat_500_600.jpg','wb')
try:
    data = f.write(cat_img)
finally:
    f.close()
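
A small self-check of that equivalence, written as a sketch: it writes a throwaway file in a temporary directory (the file name demo.bin is arbitrary) and confirms that the file object ends up closed in both forms, and also when the with block is left through an exception:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.bin")

# Form 1: the with statement closes the file when the block exits.
with open(path, "wb") as f:
    f.write(b"abc")
closed_after_with = f.closed

# Form 2: the explicit try/finally spelling.
g = open(path, "wb")
try:
    g.write(b"abc")
finally:
    g.close()
closed_after_finally = g.closed

# The with form also closes the file if the block raises.
try:
    with open(path, "wb") as h:
        raise ValueError("simulated failure while writing")
except ValueError:
    pass
closed_after_error = h.closed

print(closed_after_with, closed_after_finally, closed_after_error)
# True True True
```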

