urllib2 custom opener details

  • 2020-04-02 13:24:56
  • OfStack

The urllib2.urlopen() function does not support authentication, cookies, or other advanced HTTP features on its own. To use these features, you must create a custom opener object with the build_opener() function.


1. build_opener([handler1 [, handler2, ...]])

Each handler argument is a handler instance (a handler class is also accepted). Commonly used handlers include HTTPBasicAuthHandler, HTTPCookieProcessor, and ProxyHandler.

build_opener() returns an opener object whose open() method works the same way as the urlopen() function.

To change the HTTP headers sent with each request, set the opener's addheaders attribute:


import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
opener.open('http://www.example.com/')
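
As a minimal sketch of how the handlers passed to build_opener() end up in the opener, the snippet below builds an opener with one handler and inspects it without making a network request. The try/except import is an assumption added here so that the sketch also runs on Python 3, where urllib2 became urllib.request. The variable names are illustrative only.

```python
try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent module

# Build an opener from an explicit handler instance.
cookie_handler = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(cookie_handler)

# addheaders is a list of (name, value) tuples sent with every request.
opener.addheaders = [('User-agent', 'Mozilla/5.0')]

# The opener keeps its handlers in the `handlers` list;
# the instance passed to build_opener() is one of them.
has_cookie_handler = cookie_handler in opener.handlers
```

Nothing is fetched until opener.open() is called, so this kind of inspection can be used to check a configuration before sending any request.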

2. install_opener(opener)

Installs the given opener object as the global default opener used by urlopen().
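
A short sketch of the effect of install_opener(), under the assumption that every later urlopen() call should share the same custom headers. The fetch() helper is hypothetical and is not called here, so no network access takes place. The try/except import is only there so that the sketch also runs on Python 3.

```python
try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent module

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]

# After this call, plain urllib2.urlopen() goes through `opener`.
urllib2.install_opener(opener)


def fetch(url):
    """Hypothetical helper: fetch a URL through the globally installed opener."""
    response = urllib2.urlopen(url)
    try:
        return response.read()
    finally:
        response.close()
```

If only some requests need the custom behaviour, calling opener.open() directly avoids changing the global state.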

3. Password authentication (HTTPBasicAuthHandler)

The HTTPBasicAuthHandler() handler uses add_password() to set the password.


h.add_password(realm,uri,user,passwd)

The realm is the name or description associated with the authentication, as defined by the remote server. The uri is a base URL. user and passwd specify the username and password, respectively.


import urllib2
auth=urllib2.HTTPBasicAuthHandler()
auth.add_password('Administrator','http://www.example.com','Dave','123456')
opener=urllib2.build_opener(auth)
u=opener.open('http://www.example.com/evilplan.html')
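
To illustrate what "base URL" means for add_password(), the sketch below looks up the stored credentials through the handler's password manager (the passwd attribute) instead of contacting a server. It reuses the example values from above. The 'OtherRealm' name is made up for the comparison, and the try/except import is an assumption so that the sketch also runs on Python 3.

```python
try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent module

auth = urllib2.HTTPBasicAuthHandler()
auth.add_password('Administrator', 'http://www.example.com', 'Dave', '123456')

# The handler stores credentials in its password manager (auth.passwd).
# Because 'http://www.example.com' is a base URL, pages below it match too.
user, passwd = auth.passwd.find_user_password(
    'Administrator', 'http://www.example.com/evilplan.html')

# A realm with no stored credentials yields (None, None).
missing = auth.passwd.find_user_password(
    'OtherRealm', 'http://www.example.com/evilplan.html')
```

The realm must match the one the server announces in its authentication challenge, otherwise the handler finds no credentials to send.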

4. Cookie handling (HTTPCookieProcessor)


import urllib2,cookielib
cookie=cookielib.CookieJar()
cookiehand=urllib2.HTTPCookieProcessor(cookie)
opener=urllib2.build_opener(cookiehand)
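
The snippet above only builds the opener. As a rough sketch of how the cookie jar might be used afterwards, the hypothetical helper below opens a URL and returns the cookies the server set. It is defined but not called here, so nothing is fetched. The try/except import is an assumption so that the sketch also runs on Python 3, where cookielib became http.cookiejar.

```python
try:
    import urllib2    # Python 2, as used in this article
    import cookielib
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 equivalents
    import http.cookiejar as cookielib

cookie = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))


def fetch_and_list_cookies(url):
    """Hypothetical helper: open url, then return (name, value) pairs from the jar."""
    opener.open(url).close()
    return [(c.name, c.value) for c in cookie]
```

Cookies collected in the jar are sent back automatically on later requests made through the same opener.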

5. Proxies (ProxyHandler)

The proxies parameter of ProxyHandler is a dictionary that maps protocol names (such as http or ftp) to the URL of the corresponding proxy server.


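import urllib2
proxy = urllib2.ProxyHandler({'http': 'http://someproxy.com:8080'})
auth = urllib2.HTTPBasicAuthHandler()
# add_password() requires realm, uri, user and passwd; the values below are placeholders
auth.add_password('realm', 'http://www.example.com', 'username', 'password')
opener = urllib2.build_opener(auth, proxy)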
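
A small sketch of how the proxies dictionary shapes the handler, checked without any network access: ProxyHandler gets one <protocol>_open method per key in the dictionary, and passing a ProxyHandler instance to build_opener() replaces the default one. The variable names are illustrative, and the try/except import is an assumption so that the sketch also runs on Python 3.

```python
try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent module

proxies = {'http': 'http://someproxy.com:8080'}
proxy = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy)

# Only protocols listed in the dictionary are routed through a proxy.
has_http = hasattr(proxy, 'http_open')
has_ftp = hasattr(proxy, 'ftp_open')

# The opener contains exactly the ProxyHandler that was passed in.
proxy_handlers = [h for h in opener.handlers
                  if isinstance(h, urllib2.ProxyHandler)]
```

With this configuration, http requests made through opener.open() would go via someproxy.com:8080, while other protocols are left untouched.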

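Proxies can also be passed directly to urlopen(). Note that the proxies argument is accepted by urllib.urlopen(), not by urllib2.urlopen():


import urllib
proxy = 'http://%s:%s@%s' % ('userName', 'password', 'proxy')
information = urllib.urlopen("http://www.example.com", proxies={'http': proxy})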

