Usage Examples of the Python requests Module
- 2021-10-25 07:16:59
- OfStack
Simulate a GET request
Simulate request header information
Simulate a POST request
Upload a file with requests
Set a proxy with requests
Set a request timeout (timed with the time module)
Retry requests with the retrying module
Cookie settings
Handling certificate verification errors
Parsing data with requests + jsonpath
Why use requests:
Support for holding sessions using Cookie
Support file upload
Support for automatically determining the encoding of response content
A more user-friendly API
Simulate a GET request:
Get a token
import requests
# Example using the WeChat Official Accounts Platform
get_param_dict = {
    "grant_type": "**************",
    "appid": "**************",
    "secret": "**************",
}
response = requests.get(url='https://api.weixin.qq.com/cgi-bin/token',  # URL address
                        params=get_param_dict)  # query-string parameters
print(response.content.decode('utf-8'))
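To see what params does without calling the WeChat API, a request can be prepared but not sent. The following is a minimal sketch; the parameter values (demo_grant, demo_appid, demo_secret) are made-up placeholders, not real credentials or values from the example above.

```python
import requests

# Hypothetical placeholder values, for illustration only.
get_param_dict = {
    "grant_type": "demo_grant",
    "appid": "demo_appid",
    "secret": "demo_secret",
}

# Build the request without sending it, so no network access is needed.
prepared = requests.Request(
    "GET",
    "https://api.weixin.qq.com/cgi-bin/token",
    params=get_param_dict,
).prepare()

# requests URL-encodes the dictionary and appends it as the query string.
print(prepared.url)
```

Printing the prepared URL is a quick way to confirm that the parameters are spelled and encoded as the interface expects.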
Simulate request header information
Note: by default, requests sends its own headers (for example, a User-Agent that starts with python-requests), so the server can tell that the request comes from a Python script. For most interfaces you need to add browser-like header information manually.
# Simulate request headers for a GET request (try this when the returned data looks wrong)
# Using Baidu as an example
import requests
get_param_dict = {
    "wd": "newdream"
}
# Header dictionary (the values can be captured with a packet-capture tool)
header_info_dict = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36",
    "Accept": "text/plain, */*; q=0.01"
}
response = requests.get(url='https://www.baidu.com/s',
                        params=get_param_dict, headers=header_info_dict)
print(response.content.decode('utf-8'))
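The effect of the default headers can be inspected offline. This sketch prints the User-Agent that requests uses when none is given, then sets browser-like headers once on a Session so that every request made through it carries them. The shortened User-Agent string is an illustrative value, and the request is only prepared, never sent.

```python
import requests

# The User-Agent that requests sends when none is supplied.
default_ua = requests.utils.default_user_agent()
print(default_ua)

# Headers set on a Session are merged into every request made through it,
# so the browser-like values only need to be set once.
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6)",  # shortened, illustrative value
    "Accept": "text/plain, */*; q=0.01",
})

# Prepare (but do not send) a request to inspect the merged headers.
prepared = session.prepare_request(
    requests.Request("GET", "https://www.baidu.com/s", params={"wd": "newdream"})
)
print(prepared.headers["User-Agent"])
print(prepared.headers["Accept"])
```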
Simulate a POST request
import requests, json
# Send a POST request with requests
# Example using the WeChat Official Accounts Platform
url_param_dict = {"access_token": "43_XcK_1rvR8VPgicGGzq7Vp2QrGx30Kwhy9SSShoVTQs11G_jP9aqhy2bwRQFuG2hYzkwVjphJFfPj8WYQR8vgfu5Xej7KaZBiyPDJ9sYoCKte78sqgtBdCf6N5S8QosNXBOFSEJnzLMbxJwCOTWAgAAANQU"}
post_param_data = {
    "tag": {"name": "I'm the new label"}
}
response = requests.post(url='https://api.weixin.qq.com/cgi-bin/tags/create',
                         params=url_param_dict,
                         # json=post_param_data  # option 1: pass the dictionary directly with json=
                         data=json.dumps(post_param_data)  # option 2: use data=; to send a JSON body it must be a string, so serialize it with json.dumps
                         )
print(response.content.decode('utf-8'))
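The difference between the two options can be checked without sending anything. This is a small sketch under the assumption of an illustrative payload: it prepares the same POST once with json= and once with data=json.dumps(...), then compares the resulting headers and bodies.

```python
import json
import requests

url = "https://api.weixin.qq.com/cgi-bin/tags/create"
payload = {"tag": {"name": "demo label"}}  # illustrative payload

# Prepare both variants without sending them.
with_json = requests.Request("POST", url, json=payload).prepare()
with_data = requests.Request("POST", url, data=json.dumps(payload)).prepare()


def body_text(prepared):
    """Return the prepared body as text, whether it is stored as bytes or str."""
    body = prepared.body
    return body.decode("utf-8") if isinstance(body, bytes) else body


# json= serializes the dictionary and also sets the Content-Type header.
print(with_json.headers.get("Content-Type"))
# data= with a string sends the string as-is and sets no Content-Type.
print(with_data.headers.get("Content-Type"))
# Both variants carry the same JSON document in the body.
print(json.loads(body_text(with_json)) == json.loads(body_text(with_data)))
```

If an interface insists on a JSON Content-Type, json= is the simpler choice; with data= the header would have to be added through headers=.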
Upload a file with requests
import requests, os
# Upload a file with a POST request
current_path = os.path.dirname(__file__)  # directory of the current script
excel_path = os.path.join(current_path, '..', 'data', 'j.xlsx')  # join the path segments
with open(excel_path, 'rb') as f:  # 'rb': read-only, binary mode; the file is closed when the block ends
    excel_file = {'file': f}  # dictionary mapping the form field name to the open file
    response = requests.post(url='https://2.python-requests.org/',  # address of the official requests documentation, used as the example URL
                             files=excel_file)  # files= sends the file
print(response.content.decode('utf-8'))
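What files= produces can be inspected offline as well. In this sketch an in-memory io.BytesIO object stands in for the opened spreadsheet, and https://example.com/upload is a hypothetical URL; the request is prepared but never sent.

```python
import io
import requests

# In-memory stand-in for open(excel_path, 'rb'); the bytes are dummy content.
fake_file = io.BytesIO(b"dummy spreadsheet bytes")

# A (filename, file object) tuple controls the filename that is sent.
files = {"file": ("j.xlsx", fake_file)}

prepared = requests.Request(
    "POST", "https://example.com/upload", files=files  # hypothetical URL
).prepare()

# requests encodes the upload as multipart/form-data with a generated boundary.
print(prepared.headers["Content-Type"])
# The body contains the form field name, the filename and the file content.
print(b'filename="j.xlsx"' in prepared.body)
```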
Set a proxy with requests
import requests
# Why set a proxy?
# - Crawler projects that run into detection mechanisms
# - To avoid triggering a company system's anti-flooding protection
# - When an interface can only be reached from outside a firewall
proxy_server = {'http': 'http://127.0.0.1:8888',
                'https': 'http://127.0.0.1:8888'}  # a dictionary keyed by URL scheme
proxy_user_pass = {
    'https': 'http://username:password@127.0.0.1:8888'  # use this form when the proxy requires a username and password
}
response = requests.get(url='https://baidu.com',
                        proxies=proxy_server)  # proxies= is the keyword argument that sets the proxy
print(response.status_code)
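The proxies dictionary is looked up by the scheme of the target URL, and a scheme://hostname key can override the proxy for a single host. The sketch below uses the select_proxy helper from requests.utils purely to inspect that lookup offline; the host internal.example.com and port 9999 are hypothetical.

```python
from requests.utils import select_proxy

proxy_server = {
    "http": "http://127.0.0.1:8888",
    "https": "http://127.0.0.1:8888",
    # Hypothetical per-host override: scheme://hostname keys take priority.
    "https://internal.example.com": "http://127.0.0.1:9999",
}

# An ordinary HTTPS URL falls back to the "https" entry.
print(select_proxy("https://baidu.com", proxy_server))
# A URL on the overridden host picks the more specific entry.
print(select_proxy("https://internal.example.com/api", proxy_server))
```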
Set a request timeout (timed with the time module)
If a request goes a long time without returning a result, the whole project becomes very inefficient. In that case we need to force the request to return a result within a specific time, and raise an error otherwise.
# Set a request timeout
import requests
import time
print(time.time())  # timestamp before the request
response = requests.get(url='https://www.baidu.com', timeout=3)  # timeout=3: if the server sends no response within 3 seconds, a timeout error is raised
print(time.time())  # timestamp after the request
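A timeout error stops the script unless it is caught. The helper below is a minimal sketch, not part of requests: timed_get is a hypothetical name, the (3, 10) tuple sets separate connect and read timeouts, and the get parameter exists only so the function can be exercised without a network.

```python
import time
import requests


def timed_get(url, timeout=(3, 10), get=requests.get):
    """Return (response, elapsed_seconds); response is None on a timeout.

    timeout=(3, 10) allows 3 seconds to connect and 10 seconds of
    silence while reading. `get` is injectable for offline experiments.
    """
    start = time.monotonic()
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect timeouts and read timeouts.
        response = None
    return response, time.monotonic() - start
```

Calling timed_get('https://www.baidu.com') performs the real request and returns the response together with the elapsed time, which replaces the two print(time.time()) calls with a single measured duration.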
Retry requests with the retrying module
The timeout parameter keeps slow requests from holding up the whole run. In normal web browsing, though, when a page loads slowly we simply refresh it, and the retrying module lets our code do the same through its retry decorator.
A function decorated with retry is executed again whenever it raises an error. The stop_max_attempt_number parameter sets the upper limit on the number of attempts.
If every attempt raises an error, the decorated function raises the error. If one attempt succeeds along the way, the program continues with the code that follows.
import requests
from retrying import retry
# If all 3 consecutive calls raise an error, the error is raised; if any one of the 3 succeeds, the call succeeds
@retry(stop_max_attempt_number=3)
def get_response(url):
    response = requests.get(url, timeout=2)
    return response
retrying_requests = get_response("https://www.baidu.com")
print(retrying_requests.content.decode())
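The retry-until-limit behaviour can be illustrated with a few lines of plain Python. This is a hand-written stand-in that only mimics the described effect of stop_max_attempt_number; it is not the retrying module's own implementation, and retry_sketch and flaky are hypothetical names.

```python
import functools


def retry_sketch(stop_max_attempt_number=3):
    """Re-run the decorated function until it succeeds or the limit is hit."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, stop_max_attempt_number + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == stop_max_attempt_number:
                        raise  # every attempt failed: surface the last error
        return wrapper
    return decorator


calls = {"count": 0}


@retry_sketch(stop_max_attempt_number=3)
def flaky():
    """Fail on the first two calls, succeed on the third."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated failure")
    return "ok"


result = flaky()
print(result, calls["count"])  # ok 3
```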
Cookie settings
Benefit: pages that require a login can be accessed
Disadvantage: a set of cookies usually corresponds to one user's information, so overly frequent requests are more likely to be identified as a crawler by the other side
How to solve it? Use multiple accounts
# Use the Session class provided by requests
import requests
# Build the form data: fill in your account name and password
post_data = {
    "username": "xxxxx",
    "password": "xxxxx"
}
# Using a session: create the Session object before sending requests
session = requests.Session()
# Send the subsequent requests through the session, because it stores the user's login information
session.post(url="https://www.baidu.com", data=post_data)
# Request an interface through the session after logging in
response = session.get("https://www.baidu.com")
print(response.content.decode())
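How a Session carries cookies from one request to the next can be shown without a real login. In this sketch the cookie is set by hand as a stand-in for what a successful login response would store; the cookie name sessionid and its value are made up, and the follow-up request is prepared but not sent.

```python
import requests

session = requests.Session()

# Stand-in for the cookie a login response would set via Set-Cookie.
# The name and value are made up for illustration.
session.cookies.set("sessionid", "abc123")

# Any later request prepared through the same session carries the cookie.
prepared = session.prepare_request(
    requests.Request("GET", "https://www.baidu.com")
)
print(prepared.headers.get("Cookie"))  # sessionid=abc123
```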
Handling certificate verification errors
import requests
# Method 1: suppress the warning that is printed when certificate verification is turned off
requests.packages.urllib3.disable_warnings()  # directly silences the red warning
# Method 2: skip certificate verification by adding verify=False; the request returns 200 but prints a warning on the console, which Method 1 silences
response = requests.get('https://www.12306.cn', verify=False)
print(response.content.decode('utf-8'))
# Method 3: install pyopenssl, after which no error is reported  # pip3 install -U requests[security]
response = requests.get('https://www.12306.cn')
print(response.content.decode('utf-8'))
# Method 4: add a certificate. Inside a company, ask the developers for the xxx.crt file; this is the safest option
response = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))
Parsing data with requests + jsonpath
import requests
import jsonpath
hosts = 'https://api.weixin.qq.com'  # host address
# Get a token
get_param_dict = {
    "grant_type": "**********",
    "appid": "*************",
    "secret": "***************"
}
response = requests.get('%s/cgi-bin/token' % hosts, params=get_param_dict)
json_obj = response.json()
# JSON data parsing: extracting the required data from a JSON body
# jsonpath.jsonpath returns a list of matches, or False when nothing matches
token_id = jsonpath.jsonpath(json_obj, '$.access_token')[0]  # the token is passed on to later, dependent interface calls
print(token_id)