Usage Examples of the python requests Module

2021-10-25 07:16:59
OfStack

Contents:
Why use requests
Simulate get request
Simulate request header information
Simulate post request
Upload a file with requests
Set a proxy with requests
Set a request timeout (timed with the time module)
Retry requests with the retrying module
cookie Settings
Handling Certificate Authentication Errors
Parsing data with requests+jsonpath

Why use requests:

requests supports keeping sessions with cookies, supports file upload, automatically determines the encoding of response content, and is more user-friendly than lower-level HTTP modules.

Simulate get request:

Get token


# Example using the WeChat Official Accounts Platform
import requests

get_param_dict = {
    "grant_type": "**************",
    "appid": "**************",
    "secret": "**************",
}
response = requests.get(url='https://api.weixin.qq.com/cgi-bin/token',  # URL address
                        params=get_param_dict)  # query-string parameters
print(response.content.decode('utf-8'))
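
To see what the params argument does without calling the real interface, the request can be prepared offline. This is a minimal sketch: the parameter values below are placeholders invented for illustration (they are not the masked values above), and `requests.Request(...).prepare()` only builds the request, it does not send it.

```python
import requests

# Placeholder values for illustration only; real values come from your own account.
get_param_dict = {
    "grant_type": "demo-grant-type",
    "appid": "demo-appid",
    "secret": "demo-secret",
}

# Build the request without sending it, so no network access is needed.
prepared = requests.Request(
    "GET",
    "https://api.weixin.qq.com/cgi-bin/token",
    params=get_param_dict,
).prepare()

# The dictionary has been encoded into the query string of the final URL.
print(prepared.url)
```

Printing the prepared URL is a quick way to check parameter names and encoding before sending anything.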

Simulate request header information

Note: by default the request headers identify the client as python and requests (the User-Agent is python-requests), so for most interfaces the header information has to be added manually


import requests

# Simulate request header information on a get request (do this when the returned data looks wrong)
# Take Baidu as an example
get_param_dict = {
    "wd": "newdream"
}
# Add a header information dictionary (the header values can be captured with a packet capture tool)
header_info_dict = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36",
    "Accept": "text/plain, */*; q=0.01"
}
response = requests.get(url='https://www.baidu.com/s',
                        params=get_param_dict, headers=header_info_dict)
print(response.content.decode('utf-8'))
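
The effect of the headers argument can be checked offline. The sketch below prints the default User-Agent that requests sends and then prepares a request through a Session so that the defaults are merged with the overrides. The short User-Agent string is a made-up value for illustration, and nothing is sent over the network.

```python
import requests

# Headers that requests sends when none are supplied.
default_headers = requests.utils.default_headers()
print(default_headers["User-Agent"])  # begins with "python-requests/"

# Hypothetical browser-like value used only for this illustration.
header_info_dict = {"User-Agent": "Mozilla/5.0 (demo)"}

# Prepare through a Session so the default headers are merged with our overrides,
# the same way they are when the request is really sent.
session = requests.Session()
prepared = session.prepare_request(
    requests.Request("GET", "https://www.baidu.com/s",
                     params={"wd": "newdream"}, headers=header_info_dict)
)
print(prepared.headers["User-Agent"])
print(prepared.url)
```

Headers that are not overridden, such as the default Accept header, stay in place.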

Simulate post request


import requests, json
# Send a post request with requests
# Example using the WeChat Official Accounts Platform
url_param_dict = {"access_token": "43_XcK_1rvR8VPgicGGzq7Vp2QrGx30Kwhy9SSShoVTQs11G_jP9aqhy2bwRQFuG2hYzkwVjphJFfPj8WYQR8vgfu5Xej7KaZBiyPDJ9sYoCKte78sqgtBdCf6N5S8QosNXBOFSEJnzLMbxJwCOTWAgAAANQU"}
post_param_data = {
    "tag": {"name": "I'm the new label"}
}
response = requests.post(url='https://api.weixin.qq.com/cgi-bin/tags/create',
                         params=url_param_dict,
                         # json=post_param_data  # you can pass the dictionary directly with json=
                         data=json.dumps(post_param_data)  # or use data=; for a JSON body it must be a string, so serialize with json.dumps
                         )
print(response.content.decode('utf-8'))
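
The difference between json= and data= is easiest to see on a prepared request. This is a sketch with an illustrative payload; `body_text` is a small helper introduced here, not part of requests, and no request is sent.

```python
import json
import requests

url = "https://api.weixin.qq.com/cgi-bin/tags/create"
post_param_data = {"tag": {"name": "demo label"}}  # illustrative payload

# Prepare both variants without sending them.
with_json = requests.Request("POST", url, json=post_param_data).prepare()
with_data = requests.Request("POST", url, data=json.dumps(post_param_data)).prepare()


def body_text(prepared):
    """Return the request body as text whether it is stored as bytes or str."""
    body = prepared.body
    return body.decode("utf-8") if isinstance(body, bytes) else body


# json= serializes the dictionary and sets the Content-Type header.
print(with_json.headers.get("Content-Type"))
# data= with a string sends the string as-is and leaves Content-Type unset.
print(with_data.headers.get("Content-Type"))
# Both variants carry the same JSON document in the body.
print(json.loads(body_text(with_json)) == json.loads(body_text(with_data)))
```

If an interface insists on a JSON Content-Type, json= is the simpler choice; with data= the header would have to be added through headers=.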

Upload a file with requests


import requests, os
# Upload a file with post
current_path = os.path.dirname(__file__)  # the os module locates the directory of the current script
excel_path = os.path.join(current_path, '..', 'data', 'j.xlsx')  # join splices the path segments
with open(excel_path, 'rb') as f:  # rb: read-only binary mode; the file is closed when the block ends
    excel_file = {'file': f}  # build a dictionary that maps the form field name to the open file
    response = requests.post(url='https://2.python-requests.org/',  # address of the official requests documentation
                             files=excel_file)  # files= sends the file
print(response.content.decode('utf-8'))
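
What files= produces can be inspected without a real file or server. In this sketch an in-memory `io.BytesIO` object stands in for the opened spreadsheet, and the endpoint URL is hypothetical and never contacted.

```python
import io
import requests

# An in-memory stand-in for open(excel_path, 'rb'); the content is dummy bytes.
fake_file = io.BytesIO(b"demo bytes")

prepared = requests.Request(
    "POST",
    "https://example.invalid/upload",  # hypothetical endpoint, never contacted
    files={"file": ("j.xlsx", fake_file)},  # (filename, file object)
).prepare()

# The body is encoded as multipart/form-data with a generated boundary.
content_type = prepared.headers["Content-Type"]
print(content_type.split(";")[0])
# The form field name and the file name both appear in the encoded body.
print(b'filename="j.xlsx"' in prepared.body)
```

The dictionary key (`file` here) is the form field name the server expects, so it has to match the interface definition.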

Set a proxy with requests


import requests
# Set up a proxy. Why set up a proxy?
# - the target of a crawler project has a detection mechanism
# - to avoid triggering the anti-flooding protection of the company system
# - the interface can only be reached from behind a firewall
proxy_server = {'http': 'http://127.0.0.1:8888',
                'https': 'http://127.0.0.1:8888'}  # build a dictionary keyed by URL scheme
proxy_user_pass = {
    'https': 'http://username:password@127.0.0.1:8888'  # use this form when the proxy requires a user name and password
}
response = requests.get(url='https://baidu.com',
                        proxies=proxy_server)  # proxies= is the keyword argument that sets the proxy
print(response.status_code)
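
The keys of the proxy dictionary are matched against the scheme of the requested URL. The sketch below uses `requests.utils.select_proxy`, a helper that ships with requests, to show which entry is chosen; the two different ports are an assumption made only so the choice is visible, and no proxy needs to be running.

```python
import requests
from requests.utils import select_proxy

proxy_server = {
    "http": "http://127.0.0.1:8888",
    "https": "http://127.0.0.1:8889",  # different port only to make the choice visible
}

# The entry is picked by the scheme of the URL being requested.
print(select_proxy("http://baidu.com", proxy_server))
print(select_proxy("https://baidu.com", proxy_server))

# Proxies can also be stored on a Session so every request sent through it uses them.
session = requests.Session()
session.proxies.update(proxy_server)
print(session.proxies)
```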

Set a request timeout (timed with the time module)

If a request goes a long time without returning a result, the efficiency of the whole project becomes very low, so we need to put a time limit on the request.

The request must return a result within a specific time; otherwise it raises an error.


# Set request timeout
import requests
import time
print(time.time())  # timestamp before the request
response = requests.get(url='https://www.baidu.com', timeout=3)  # timeout=3: if no response is received within 3 seconds, a timeout error is raised
print(time.time())  # timestamp after the request
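
In practice the timeout error usually needs to be caught, otherwise it stops the script. A minimal sketch, assuming a helper named `timed_get` (introduced here for illustration) that reports the elapsed time and returns None when the request times out:

```python
import time
import requests


def timed_get(url, timeout=3):
    """Return (response_or_None, elapsed_seconds); None means the request timed out."""
    start = time.time()
    try:
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect timeouts and read timeouts.
        response = None
    return response, time.time() - start
```

A call such as `timed_get('https://www.baidu.com', timeout=(3, 10))` would use the tuple form of timeout, which sets the connect and read limits separately.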

Retry requests with the retrying module

The timeout parameter keeps a slow request from holding up the whole run, but when a page loads slowly in a normal browser we simply refresh it, and a program needs an equivalent way to try again.

The retrying module solves this problem with the retry decorator it provides.

Decorating a function with retry makes the function run again after it raises an error, and the stop_max_attempt_number parameter sets the upper limit on the number of attempts.

If every attempt up to that limit raises an error, the whole call raises an error. If one attempt succeeds along the way, the program continues with that result.


import requests
from retrying import retry


# If the function raises an error on 3 consecutive attempts, the error is raised to the caller; if any 1 of the 3 attempts succeeds, the call succeeds
@retry(stop_max_attempt_number=3)
def get_response(url):
 response = requests.get(url, timeout=2)
 return response
retrying_requests = get_response("https://www.baidu.com")
print(retrying_requests.content.decode())
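
The retry behaviour can be illustrated without the network or the retrying package. `retry_sketch` below is a hypothetical stand-in written for this illustration, limited to a maximum attempt count; the real retry decorator offers more options, such as waiting between attempts.

```python
import functools


def retry_sketch(stop_max_attempt_number=3):
    """Hypothetical stand-in for retrying.retry, limited to a maximum attempt count."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, stop_max_attempt_number + 1):
                try:
                    return func(*args, **kwargs)  # a success ends the loop
                except Exception:
                    if attempt == stop_max_attempt_number:
                        raise  # the last attempt failed: propagate the error
        return wrapper
    return decorator


calls = {"count": 0}


@retry_sketch(stop_max_attempt_number=3)
def flaky():
    """Fail on the first two calls and succeed on the third."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated failure")
    return "ok"


print(flaky(), calls["count"])  # prints: ok 3
```

With a limit of 2 instead of 3, the same function would raise ConnectionError to the caller.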

cookie Settings

Benefit: pages that are only available after login can be accessed

Disadvantage: a set of cookies usually corresponds to one user's information, so the other party is more likely to identify you as a crawler if requests are too frequent
How to solve it? Use multiple accounts


# Use the Session class provided by requests
import requests
# Construct the formdata form data; fill in your account and password
post_data = {
    "username": "xxxxx",
    "password": "xxxxx"
}
# Using a session: create the session object before sending requests
session = requests.Session()
# Subsequent requests are sent through the session, because the session keeps the user's login information (cookies)
session.post(url="https://www.baidu.com", data=post_data)
response = session.get("https://www.baidu.com")
# Request an interface through the session after login
print(response.content.decode())
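
How a session carries login state can be shown offline. In this sketch the cookie is placed in the jar by hand to stand in for what a successful login response would set; the cookie name and value are made up, and nothing is sent.

```python
import requests

session = requests.Session()
# Stand-in for a login response: normally the server's Set-Cookie fills the jar.
session.cookies.set("sessionid", "abc123")  # hypothetical cookie name and value

# Any request prepared through the same session carries the stored cookie.
prepared = session.prepare_request(
    requests.Request("GET", "https://www.baidu.com")
)
print(prepared.headers.get("Cookie"))
```

A request made with plain `requests.get` would not include this cookie, which is why the requests after login have to go through the same session object.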

Handling Certificate Authentication Errors


import requests
# Mode 1: suppress the warning that is printed when certificate verification is turned off
requests.packages.urllib3.disable_warnings()  # removes the red warning directly

# Mode 2: do not verify the certificate by appending verify=False; the request returns 200 but prints a warning, so combine it with Mode 1 if the console warning bothers you
response = requests.get('https://www.12306.cn', verify=False)
print(response.content.decode('utf-8'))

# Mode 3: install pyopenssl; no error is reported after installation
# pip3 install -U requests[security]
response = requests.get('https://www.12306.cn')
print(response.content.decode('utf-8'))

# Mode 4: add a certificate; inside a company, ask the developers for the xxx.crt file. This is the safest mode
response = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))
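
Two different settings are involved when certificates are supplied: verify controls how the server's certificate is checked and may be a path to a CA bundle, while cert supplies a client-side certificate. A sketch of setting both once on a Session; every path below is a hypothetical placeholder, and no file is read until a request is actually sent.

```python
import requests

session = requests.Session()
print(session.verify, session.cert)  # defaults: verification on, no client certificate

# Hypothetical path to a CA certificate file obtained from the development team.
session.verify = "/path/ca-bundle.crt"
# Hypothetical client certificate and key, only needed if the server asks for one.
session.cert = ("/path/client.crt", "/path/client.key")
```

Requests sent through this session would then use these settings without repeating verify= or cert= on every call.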

Parsing data with requests+jsonpath


import requests
import jsonpath

hosts = 'https://api.weixin.qq.com'  # host address
# Get token
get_param_dict = {
    "grant_type": "**********",
    "appid": "*************",
    "secret": "***************"
}
response = requests.get('%s/cgi-bin/token' % hosts, params=get_param_dict)
json_obj = response.json()
# JSON data parsing: taking the required data out of a JSON body is called JSON data parsing
token_id = jsonpath.jsonpath(json_obj, '$.access_token')[0]  # interface dependency (interface association): later requests use this token
print(token_id)
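
An expression such as `$.access_token` selects the access_token key at the top level of the JSON body. The sketch below imitates that lookup with a small hypothetical helper, `first_match`, which handles only plain child steps and is not the jsonpath library; the response bodies are made-up samples. It also shows why a missing key should be checked: the jsonpath package is generally documented to return False when nothing matches, in which case indexing with `[0]` would fail.

```python
def first_match(json_obj, path):
    """Resolve a simple '$.a.b' style path against nested dicts; None if absent.

    Hypothetical helper: it covers only plain child steps, not full JSONPath.
    """
    current = json_obj
    for key in path.lstrip("$").strip(".").split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Sample bodies for illustration; the values are made up.
success_body = {"access_token": "demo-token", "expires_in": 7200}
error_body = {"errcode": 40013, "errmsg": "invalid appid"}

token_id = first_match(success_body, "$.access_token")
print(token_id)
print(first_match(error_body, "$.access_token"))

# Interface association: the extracted token becomes a parameter of the next request.
url_param_dict = {"access_token": token_id}
```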

That covers the details of using the python requests module.