urllib2 custom openers in detail
- 2020-04-02 13:24:56
- OfStack
By default, the urllib2.urlopen() function does not handle authentication, cookies, or other advanced HTTP features. To use them, you must create a custom opener object with the build_opener() function.
1. build_opener([handler1 [, handler2, ... ]])
Each handler argument is a handler instance; commonly used handlers include HTTPBasicAuthHandler, HTTPCookieProcessor, and ProxyHandler.
build_opener() returns an object whose open() method works the same way as the urlopen() function.
To change the HTTP headers sent with each request, set the opener's addheaders attribute:
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
opener.open('http://www.example.com/')
2. install_opener(opener)
Installs the given opener object as the global opener used by urlopen().
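A minimal sketch of how this might look in practice: once an opener is installed, ordinary urlopen() calls go through it. The helper name fetch is hypothetical, and the import fallback is an assumption so the example also runs on Python 3.

```python
try:
    import urllib2                      # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2    # Assumption: Python 3 fallback

# Build an opener with a custom User-agent header
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]

# Make it the global opener used by urlopen()
urllib2.install_opener(opener)

def fetch(url):
    """Hypothetical helper: urlopen() now sends the custom header."""
    return urllib2.urlopen(url).read()
```

Calling opener.open() directly still works after installation; install_opener() only affects code that calls urlopen().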
3. Password authentication (HTTPBasicAuthHandler)
The HTTPBasicAuthHandler() handler uses add_password() to set the password.
h.add_password(realm,uri,user,passwd)
realm is the name or description associated with the authentication; its value depends on the remote server. uri is the base URL to which the credentials apply. user and passwd specify the username and password, respectively.
import urllib2
auth=urllib2.HTTPBasicAuthHandler()
auth.add_password('Administrator','http://www.example.com','Dave','123456')
opener=urllib2.build_opener(auth)
u=opener.open('http://www.example.com/evilplan.html')
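Because the realm is chosen by the remote server, it is not always known in advance. One possible approach, sketched here as an assumption rather than as the article's own method, is to use HTTPPasswordMgrWithDefaultRealm and pass None as the realm, so the credentials apply to any realm under the base URL. The credentials are the same placeholders as above.

```python
try:
    import urllib2                      # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2    # Assumption: Python 3 fallback

# Password manager that falls back to realm None when no realm matches
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'http://www.example.com', 'Dave', '123456')

auth = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(auth)

# Local lookup (no network): any realm under the base URL gets the credentials
user, passwd = password_mgr.find_user_password(
    'Some realm', 'http://www.example.com/evilplan.html')
```

The opener is then used exactly as before, for example opener.open('http://www.example.com/evilplan.html').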
4. Cookie processor
import urllib2,cookielib
cookie=cookielib.CookieJar()
cookiehand=urllib2.HTTPCookieProcessor(cookie)
opener=urllib2.build_opener(cookiehand)
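The cookie jar collects any cookies the server sets, and the same opener sends them back on later requests. A small sketch of how the stored cookies could be inspected; the helper name list_cookies is hypothetical, and the import fallback is an assumption for Python 3.

```python
try:
    import urllib2                      # Python 2, as used in this article
    import cookielib
except ImportError:
    # Assumption: Python 3 fallback, where the same names were moved
    import urllib.request as urllib2
    import http.cookiejar as cookielib

cookie_jar = cookielib.CookieJar()
cookie_handler = urllib2.HTTPCookieProcessor(cookie_jar)
opener = urllib2.build_opener(cookie_handler)

def list_cookies(url):
    """Hypothetical helper: fetch url, then return (name, value) pairs
    for every cookie now held in the jar."""
    opener.open(url)
    return [(c.name, c.value) for c in cookie_jar]
```

Before any request is made the jar is empty; each call to opener.open() may add to it, depending on what the server sends.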
5. Proxies (ProxyHandler)
The proxies parameter of ProxyHandler(proxies) is a dictionary that maps protocol names (such as http and ftp) to the URL of the corresponding proxy server.
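from urllib2 import ProxyHandler, HTTPBasicAuthHandler, build_opener
proxy=ProxyHandler({'http':'http://someproxy.com:8080'})
auth=HTTPBasicAuthHandler()
# add_password() requires all four arguments; these are the placeholder values from section 3
auth.add_password('Administrator','http://www.example.com','Dave','123456')
opener=build_opener(auth,proxy)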
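A proxy can also be passed directly to urlopen(), but only the older urllib.urlopen() accepts a proxies argument; urllib2.urlopen() does not:
import urllib
proxy = 'http://%s:%s@%s' % ('userName', 'password', 'proxy')
information = urllib.urlopen("http://www.example.com", proxies={'http':proxy})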
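With urllib2 itself, a proxy is normally supplied through a ProxyHandler rather than as an argument to urlopen(). A minimal sketch, assuming the same placeholder user name, password, and proxy host as above; the helper name fetch_via_proxy is hypothetical, and the import fallback is an assumption for Python 3.

```python
try:
    import urllib2                      # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2    # Assumption: Python 3 fallback

# Placeholder credentials and host, embedded in the proxy URL
proxy_url = 'http://%s:%s@%s' % ('userName', 'password', 'proxy')

proxy_handler = urllib2.ProxyHandler({'http': proxy_url})
opener = urllib2.build_opener(proxy_handler)

# After installation, plain urlopen() calls for http URLs use the proxy
urllib2.install_opener(opener)

def fetch_via_proxy(url):
    """Hypothetical helper: fetch url through the installed proxy opener."""
    return urllib2.urlopen(url).read()
```

Only the http scheme is mapped here; requests for other schemes would need their own entries in the dictionary.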