urllib module usage examples in Python

2020-04-02 14:21:08
OfStack

This article illustrates the use of the urllib module in Python through worked examples, shared here for reference. The details are as follows:

I. Problem:

Based on the API provided by a customer, we need to fetch data at regular intervals. The previous approach used PHP to collect tasks and store them in a Redis queue, with a resident process on Linux running a PHP file that contains an infinite loop.

II. Solution:

I recently learned Python, and its urllib module is probably faster and simpler to use than PHP's curl:

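    # -*- coding: utf-8 -*-
    # Python 2 code
    import sys
    import json
    from urllib import urlopen

    # Python 2 workaround: make UTF-8 the default encoding so that
    # non-ASCII text in the response can be printed without errors
    reload(sys)
    sys.setdefaultencoding("utf-8")

    # Fetch the API response and parse the JSON body into a dict
    doc = urlopen("http://xxxx?webid=1&tid=901&cateid=101").read()
    doc = json.loads(doc)

    print doc
    print doc.keys()
    print doc["msg"]
    print doc['data']
    print doc['ret']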
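
The script above targets Python 2. On Python 3, `urlopen` lives in `urllib.request` and `print` is a function. A minimal sketch of the same fetch-and-parse step under that assumption follows; the helper name `fetch_json`, the 10-second timeout, and the UTF-8 assumption are illustrative choices, not part of the original script.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Fetch `url` and parse the response body as JSON.

    Hypothetical helper: the name and the default timeout are
    assumptions made for this sketch.
    """
    with urlopen(url, timeout=timeout) as response:
        body = response.read()  # raw bytes
    # Assumes the API returns UTF-8 encoded JSON
    return json.loads(body.decode("utf-8"))
```

Calling `fetch_json("http://xxxx?webid=1&tid=901&cateid=101")` would then return a dict whose `msg`, `data`, and `ret` keys can be read as before.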

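The first run took 3.0s ([Finished in 3.0s]) and the second took 0.2s ([Finished in 0.2s]).
This points to caching somewhere along the path. Note that urllib itself does not cache HTTP responses, so the speedup more likely comes from DNS lookup caching or from caching on the server side.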
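
To see where the time goes, one option is to time several requests inside a single process, so that interpreter start-up is excluded from the comparison. The sketch below is written for Python 3 (where `urlopen` lives in `urllib.request`); `time_requests` is a hypothetical helper, not something from the original script.

```python
import time
from urllib.request import urlopen


def time_requests(url, count=2, timeout=10):
    """Return the elapsed seconds for `count` consecutive fetches of `url`."""
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        with urlopen(url, timeout=timeout) as response:
            response.read()  # include the body download in the measurement
        timings.append(time.perf_counter() - start)
    return timings
```

If the first timing is still much larger than the later ones, the extra cost probably lies in DNS resolution or a cold server-side cache; if all timings are similar, the difference between separate runs was more likely start-up overhead.
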
A typical example of using urllib together with urllib2 (a login that keeps cookies):

    import urllib
    import urllib2
    import cookielib

    class Hi_login:

        def __init__(self):
            cookie = cookielib.CookieJar()
            # Handler that stores cookies and resends them on later requests
            self.cookie = urllib2.HTTPCookieProcessor(cookie)

        def login(self, user, pwd):
            url = 'http://passport.baidu.com/?login'
            # Form fields, URL-encoded once into the POST body
            postdata = urllib.urlencode({
                'mem_pass': 'on',
                'password': pwd,
                'Submit': '',
                'tpl': 'sp',
                'tp_reg': 'sp',
                'u': 'http://hi.baidu.com',
                'username': user})
            # To go through a proxy, create
            #   proxy_support = urllib2.ProxyHandler({"http": "http://ahad-haam:3128"})
            # and pass it to build_opener() as well.
            opener = urllib2.build_opener(self.cookie)  # use the cookie handler
            # Headers are a plain dict; X-Forwarded-For, Referer, etc. can be added
            headers = {
                'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
            urllib2.install_opener(opener)
            # postdata is already URL-encoded, so pass it as-is
            request = urllib2.Request(url, postdata, headers=headers)
            urllib2.urlopen(request)

    if __name__ == '__main__':
        pwd = '123456'
        user = 'xiaofu'
        test = Hi_login()
        test.login(user, pwd)
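
On Python 3, `urllib2` and `cookielib` no longer exist under those names: their contents live in `urllib.request`, `urllib.parse`, and `http.cookiejar`. The following is a rough Python 3 sketch of the same idea, not a tested login against the real site. The helper names `build_opener_with_cookies` and `build_login_request` are made up for illustration, and the form fields are simply copied from the example above.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# User-Agent string taken from the example above
USER_AGENT = ("Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) "
              "Gecko/20091201 Firefox/3.5.6")


def build_opener_with_cookies(proxy=None):
    """Return (opener, cookie_jar); `proxy` is an optional HTTP proxy URL."""
    cookie_jar = http.cookiejar.CookieJar()
    handlers = [urllib.request.HTTPCookieProcessor(cookie_jar)]
    if proxy:
        handlers.append(urllib.request.ProxyHandler({"http": proxy}))
    return urllib.request.build_opener(*handlers), cookie_jar


def build_login_request(url, user, pwd):
    """Build the POST request; the form fields mirror the example above."""
    fields = {
        "mem_pass": "on",
        "password": pwd,
        "Submit": "",
        "tpl": "sp",
        "tp_reg": "sp",
        "u": "http://hi.baidu.com",
        "username": user,
    }
    # In Python 3 the POST body must be bytes, so encode after urlencode()
    postdata = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=postdata,
                                  headers={"User-Agent": USER_AGENT})
```

A login attempt would then be `opener, jar = build_opener_with_cookies()` followed by `opener.open(build_login_request('http://passport.baidu.com/?login', user, pwd))`. Cookies set by the server are collected in `jar` and sent again on later requests made through the same opener. Returning the opener instead of calling `install_opener` keeps the cookie handling local rather than global.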

To access a page that requires HTTP authentication, such as a Nagios monitoring page, register the credentials with a password manager and use HTTPBasicAuthHandler:

    import urllib2

    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    url = "http://202.1.x.y/nagios"
    # realm=None: use these credentials for any realm at this URL
    password_mgr.add_password(None, url, user='abc', passwd='xxxxxx')

    handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    opener = urllib2.build_opener(handler)
    # Make this opener the default for urllib2.urlopen()
    urllib2.install_opener(opener)

    f = urllib2.urlopen(url)
    print f.code

If authentication succeeds, this prints 200. If the credentials are rejected, urlopen raises urllib2.HTTPError with code 401 instead.
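
For Python 3, the same handler classes are available under `urllib.request`. The sketch below wraps the steps above in two hypothetical helpers, `build_auth_opener` and `fetch_status`, and catches `urllib.error.HTTPError` so that a rejected login is reported as a status code rather than as an exception. The names and the timeout value are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def build_auth_opener(url, user, passwd):
    """Return an opener that answers HTTP Basic auth challenges for `url`."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None: use these credentials for any realm at this URL
    password_mgr.add_password(None, url, user, passwd)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


def fetch_status(opener, url, timeout=10):
    """Return the HTTP status code, including error codes such as 401."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses, e.g. 401 when credentials are rejected
        return err.code
```

With these helpers, `fetch_status(build_auth_opener(url, 'abc', 'xxxxxx'), url)` would be expected to return 200 on success and 401 on an authentication failure.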

I hope this article has helped you with your Python programming.