urllib module usage examples in Python

  • 2020-04-02 14:21:08
  • OfStack

This article illustrates the use of the urllib module in Python with a few examples, shared here for reference. The details are as follows:

I. Problem:

According to the API provided by the customer, we need to fetch data at regular intervals. The previous plan was to use PHP to collect the tasks and store them in a redis queue, then keep a resident process on Linux running a PHP file that consists of an infinite loop.

II. Solution:

I recently learned Python, whose urllib module is simpler to use than PHP's curl, and possibly faster. A first attempt (Python 2):

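# -*- coding: utf-8 -*-
# Python 2 code: urlopen lives in urllib, and print is a statement.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")  # Python 2 only: default codec for implicit str/unicode conversion
import json
from urllib import urlopen

# Fetch the response body, then decode the JSON it contains
doc = urlopen("http://xxxx?webid=1&tid=901&cateid=101").read()
doc = json.loads(doc)
print doc
print doc.keys()
print doc["msg"]
print doc['data']
print doc['ret']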

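The first run took [Finished in 3.0s].
The second run took [Finished in 0.2s].
This does not mean that urllib caches responses: urlopen sends a new request on every call, and each run here is a separate process. The gap more likely comes from outside the module, for example DNS lookups cached by the operating system or caching on the server side.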
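
To see where the time actually goes, the requests can be timed inside a single process. Below is a minimal sketch in Python 3, where urlopen lives in urllib.request. The helper names fetch_json and timed_fetch, and the injectable opener parameter, are assumptions made for illustration and are not part of the script above.

```python
import json
import time
from urllib.request import urlopen


def fetch_json(url, opener=urlopen, timeout=10):
    """Fetch url and decode the JSON body (hypothetical helper)."""
    # opener defaults to urllib.request.urlopen; it is a parameter only so
    # that the function can be exercised without network access.
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def timed_fetch(url, repeat=2, opener=urlopen):
    """Return (last_document, [elapsed seconds for each request])."""
    timings = []
    doc = None
    for _ in range(repeat):
        start = time.perf_counter()
        doc = fetch_json(url, opener=opener)
        timings.append(time.perf_counter() - start)
    return doc, timings
```

Calling timed_fetch("http://xxxx?webid=1&tid=901&cateid=101") with the real API address returns the decoded document and one timing per request, so the first and later requests can be compared directly.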
A typical example of using urllib and urllib2 together:
    import urllib
    import urllib2
    import cookielib

    class Hi_login:
        def __init__(self):
            cookie = cookielib.CookieJar()
            self.cookie = urllib2.HTTPCookieProcessor(cookie)  # handler that stores and resends cookies

        def login(self, user, pwd):
            url = 'http://passport.baidu.com/?login'
            postdata = urllib.urlencode({
                'mem_pass': 'on',
                'password': pwd,
                'Submit': '',
                'tpl': 'sp',
                'tp_reg': 'sp',
                'u': 'http://hi.baidu.com',
                'username': user})
            # To go through a proxy, create
            # proxy_support = urllib2.ProxyHandler({"http": "http://ahad-haam:3128"})
            # and pass it to build_opener as well
            opener = urllib2.build_opener(self.cookie)  # use the cookie handler
            # headers is a dict; X-Forwarded-For, Referer, etc. can be added too
            headers = {
                'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
            urllib2.install_opener(opener)
            # postdata is already URL-encoded, so pass it as is
            request = urllib2.Request(url, postdata, headers=headers)
            urllib2.urlopen(request)

    if __name__ == '__main__':
        pwd = '123456'
        user = 'xiaofu'
        test = Hi_login()
        test.login(user, pwd)
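
For reference, in Python 3 the same pieces live in urllib.parse, urllib.request and http.cookiejar. The sketch below builds the cookie-aware opener and the POST request without sending anything. The function names build_cookie_opener and build_login_request are assumptions made for illustration, the User-Agent value is a shortened placeholder, and the form fields are copied from the example above.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = "http://passport.baidu.com/?login"  # URL from the example above


def build_cookie_opener():
    """Return (opener, jar); the opener keeps cookies in jar between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(user, pwd):
    """Build the login POST request (hypothetical helper, sends nothing)."""
    form = {
        "mem_pass": "on",
        "password": pwd,
        "Submit": "",
        "tpl": "sp",
        "tp_reg": "sp",
        "u": "http://hi.baidu.com",
        "username": user,
    }
    # urlencode returns str; Request needs bytes, so encode exactly once.
    data = urllib.parse.urlencode(form).encode("ascii")
    headers = {"User-Agent": "Mozilla/5.0"}  # shortened placeholder value
    return urllib.request.Request(LOGIN_URL, data=data, headers=headers)
```

To actually send the request, pass it to the opener: opener.open(build_login_request(user, pwd)). Any cookies set by the server then end up in the jar and are sent back on later requests made through the same opener.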

If the page requires authentication, such as a nagios monitoring page, add an HTTP basic authentication handler:
    import urllib2
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    url = "http://202.1.x.y/nagios"
    password_mgr.add_password(None, url, user='abc',passwd='xxxxxx')
    handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    opener = urllib2.build_opener(handler)
    urllib2.install_opener(opener)
    f=urllib2.urlopen(url)
    print f.code

The script prints 200 when authentication succeeds; otherwise urlopen raises urllib2.HTTPError with code 401 (authentication error).
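
Because a failed login surfaces as an exception rather than a return value, the call is usually wrapped in try/except. Below is a minimal sketch in Python 3, where urllib.request and urllib.error take the place of urllib2. The helper names build_auth_opener and status_code are hypothetical, and the credentials in the usage note are the placeholders from the example above.

```python
import urllib.error
import urllib.request


def build_auth_opener(url, user, passwd):
    """Return an opener that answers HTTP basic auth challenges for url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, user, passwd)  # None = any realm
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


def status_code(open_func, url):
    """Return the HTTP status for url, including error statuses such as 401."""
    try:
        with open_func(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is on err.code
        return err.code
```

With opener = build_auth_opener("http://202.1.x.y/nagios", "abc", "xxxxxx"), calling status_code(opener.open, "http://202.1.x.y/nagios") returns the status as a number in both the success and the failure case, instead of letting the exception propagate.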

I hope this article has helped you with your Python programming.

