Setting an HTTP proxy in a Python program

  • 2020-05-17 05:45:25
  • OfStack

0x00 Preface

You are probably all familiar with HTTP proxies, which are used widely and in many ways. HTTP proxies are divided into forward proxies and reverse proxies. The latter are generally used to expose services behind a firewall to users, or for load balancing; typical examples include Nginx and HAProxy. This article discusses forward proxies.

The most common uses of an HTTP proxy are network sharing, network acceleration, and bypassing network restrictions. In addition, HTTP proxies are often used for debugging Web applications and for monitoring and analyzing the Web APIs called by Android/iOS apps. Well-known tools for this include Fiddler, Charles, Burp Suite and mitmproxy. An HTTP proxy can also modify request/response content, adding functionality to a Web application or changing its behavior without changing the server.

0x01 What is an HTTP proxy

An HTTP proxy is essentially a Web application, not fundamentally different from any other. On receiving a request, the proxy determines the target host from the host name in the Host header and the request address in the GET/POST request line, establishes a new HTTP request to that host, forwards the request data, and then forwards the response data it receives back to the client.

If the request address is an absolute address, the HTTP proxy uses the host in that address; otherwise it uses the Host header. A simple test shows this, assuming the following network environment:


192.168.1.2 Web server
192.168.1.3 HTTP proxy server

Test with telnet:


$ telnet 192.168.1.3
GET / HTTP/1.0
HOST: 192.168.1.2

Note that the request must end with two consecutive carriage returns, as required by the HTTP protocol. Once the request is sent, you receive the page content for http://192.168.1.2/. Now make one adjustment and send the GET request with an absolute address:


$ telnet 192.168.1.3
GET http://httpbin.org/ip HTTP/1.0
HOST: 192.168.1.2

Note that the Host header is still set to 192.168.1.2, but the response is the content of http://httpbin.org/ip, namely the public IP address information.

As the test above shows, an HTTP proxy is not very complicated: the client simply sends the original request to the proxy server. When an HTTP proxy cannot be configured and only a small number of hosts need to go through it, the easiest approach is to point the target host's domain name at the proxy server's IP, which can be done by modifying the hosts file.
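The host-selection rule from the telnet test can be sketched in a few lines of Python. This is only an illustration of the rule, not how any particular proxy implements it; the function name pick_target_host is made up for this example.

```python
from urllib.parse import urlsplit


def pick_target_host(request_line, headers):
    """Return the host a forward proxy would contact for this request.

    request_line: for example "GET http://httpbin.org/ip HTTP/1.0"
    headers: dict of header names to values (names compared case-insensitively)
    """
    method, target, version = request_line.split()
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # Absolute address: the host in the URL wins over the Host header.
        return parts.netloc
    # Relative address: fall back to the Host header.
    for name, value in headers.items():
        if name.lower() == "host":
            return value.strip()
    raise ValueError("no target host in request")


# The two requests from the telnet test above.
print(pick_target_host("GET / HTTP/1.0", {"HOST": "192.168.1.2"}))
print(pick_target_host("GET http://httpbin.org/ip HTTP/1.0", {"HOST": "192.168.1.2"}))
```

The first call prints 192.168.1.2 and the second prints httpbin.org, matching what the proxy returned in each test.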

0x02 Setting an HTTP proxy in a Python program

urllib2/urllib proxy settings

urllib2 is part of the Python 2 standard library. It is powerful but a little cumbersome to use. In Python 3, urllib2 no longer exists; its functionality moved into the urllib package (urllib.request). With urllib2, the proxy server is set using ProxyHandler.


import urllib2

proxy_handler = urllib2.ProxyHandler({'http': '121.193.143.249:80'})
opener = urllib2.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())

You can also use install_opener to install the configured opener globally, so that every urllib2.urlopen call automatically uses the proxy.


urllib2.install_opener(opener)
r = urllib2.urlopen('http://httpbin.org/ip')
print(r.read())

In Python 3, use urllib.request instead.


import urllib.request

proxy_handler = urllib.request.ProxyHandler({'http': 'http://121.193.143.249:80/'})
opener = urllib.request.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())
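The Python 3 pieces can be wrapped in small helpers. The following is a minimal sketch; the helper names build_proxy_opener and build_direct_opener are made up for this example, and routing https through the same proxy is an assumption about the setup rather than something the proxy above is known to support.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def build_direct_opener():
    """Return an opener that never uses a proxy."""
    # An empty dict turns proxying off, including auto-detected settings.
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


opener = build_proxy_opener("http://121.193.143.249:80")
# opener.open('http://httpbin.org/ip') would now go through the proxy.
```

urllib.request.install_opener(opener) is the Python 3 counterpart of urllib2.install_opener, if the proxy should apply to every urllib.request.urlopen call.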

requests proxy settings

requests is currently one of the best HTTP libraries, and it is the one I use most when constructing HTTP requests. Its API is user-friendly and easy to use. Setting a proxy for requests is very simple: just pass a proxies parameter such as {'http': 'x.x.x.x:8080', 'https': 'x.x.x.x:8080'}. The http and https entries are independent of each other.


In [5]: requests.get('http://httpbin.org/ip', proxies={'http': '121.193.143.249:80'}).json()
Out[5]: {'origin': '121.193.143.249'}

You can also set the proxies attribute of a session directly, so that you do not have to pass the proxies parameter with every request.


import requests

s = requests.session()
s.proxies = {'http': '121.193.143.249:80'}
print(s.get('http://httpbin.org/ip').json())
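Because the http and https entries are independent, a mapping that contains only 'http' leaves https:// URLs unproxied. A minimal sketch of a helper that fills in both entries follows; make_proxies and new_proxied_session are hypothetical names, and using one proxy for both schemes is an assumption.

```python
def make_proxies(host, port, scheme="http"):
    """Build a proxies mapping that routes both http and https via one proxy."""
    proxy_url = "{}://{}:{}".format(scheme, host, port)
    return {"http": proxy_url, "https": proxy_url}


def new_proxied_session(proxies):
    """Return a requests.Session whose requests all use `proxies`."""
    # requests is third-party; it is imported here so that make_proxies
    # stays usable without it.
    import requests

    session = requests.Session()
    session.proxies.update(proxies)
    return session


proxies = make_proxies("121.193.143.249", 80)
print(proxies)
```

The resulting dict can be passed as the proxies parameter of requests.get, or handed to new_proxied_session to get a session that applies it to every request.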

0x03 HTTP_PROXY/HTTPS_PROXY environment variable

Both urllib2 and requests recognize the HTTP_PROXY and HTTPS_PROXY environment variables; once they detect these variables, they use the proxy automatically. This is useful when debugging with an HTTP proxy, because you can change the proxy server's IP address and port through the environment without changing the code. Most software on *nix also recognizes the HTTP_PROXY environment variable, such as curl, wget, axel and aria2c.


$ http_proxy=121.193.143.249:80 python -c 'import requests; print(requests.get("http://httpbin.org/ip").json())'
{u'origin': u'121.193.143.249'}

$ http_proxy=121.193.143.249:80 curl httpbin.org/ip
{
 "origin": "121.193.143.249"
}

In an interactive IPython session, you often need to debug HTTP requests temporarily; you can add or remove the HTTP proxy simply by setting os.environ['http_proxy'].


In [245]: os.environ['http_proxy'] = '121.193.143.249:80'
In [246]: requests.get("http://httpbin.org/ip").json()
Out[246]: {u'origin': u'121.193.143.249'}
In [249]: os.environ['http_proxy'] = ''
In [250]: requests.get("http://httpbin.org/ip").json()
Out[250]: {u'origin': u'x.x.x.x'}
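The set-then-clear pattern from the IPython session can be wrapped in a context manager so the variable is always restored. This is a sketch using only the standard library; temporary_http_proxy is a made-up name, and urllib.request.getproxies() is used here only to show what urllib would detect.

```python
import contextlib
import os
import urllib.request


@contextlib.contextmanager
def temporary_http_proxy(proxy_url):
    """Set http_proxy for the duration of a with-block, then restore it."""
    previous = os.environ.get("http_proxy")
    os.environ["http_proxy"] = proxy_url
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop("http_proxy", None)
        else:
            os.environ["http_proxy"] = previous


with temporary_http_proxy("http://121.193.143.249:80"):
    # getproxies() reports the proxies urllib would pick up automatically.
    print(urllib.request.getproxies().get("http"))
```

Any request made inside the with-block by a library that honors http_proxy goes through the proxy; once the block exits, the previous setting is back in place.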

0x04 MITM-Proxy

MITM stands for Man-in-the-Middle attack, in which an attacker intercepts, eavesdrops on and tampers with data in transit between the client and the server.

mitmproxy is an open source man-in-the-middle proxy written in Python. It supports SSL, transparent proxying, reverse proxying, traffic recording and replay, and custom scripts. It is similar in function to Fiddler on Windows, but mitmproxy is a console program without a GUI; it is nevertheless convenient to use. With mitmproxy, you can easily filter, intercept and modify any HTTP request/response passing through the proxy, and you can even use its scripting API to write scripts that intercept and modify HTTP data automatically.


A short mitmproxy script can add a header named BOOM to every proxied HTTP response. Saved as test.py, it is loaded with the command mitmproxy -s test.py, and curl then confirms that the extra BOOM header is indeed present.
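A minimal sketch of what such a test.py might look like, assuming a recent mitmproxy release in which a script may define a module-level response(flow) hook. The header value is a placeholder chosen for this example, not the original author's.

```python
# test.py: a minimal mitmproxy script (illustrative sketch).


def response(flow):
    """Called by mitmproxy for each HTTP response passing through the proxy."""
    # The value "boom" is a placeholder; only the header name comes from the text.
    flow.response.headers["BOOM"] = "boom"
```

Assuming mitmproxy listens on its default port 8080 on the local machine, a command such as curl -x http://127.0.0.1:8080 -I http://httpbin.org/get should then show the BOOM header in the response.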


Obviously, a mitmproxy script can do much more than that. Combined with the power of Python, many applications are possible. In addition, mitmproxy provides a powerful API on top of which you can build a fully customized, special-purpose proxy server.

In performance testing, mitmproxy turned out not to be particularly efficient. That is fine if you only use it for debugging, but in a production environment, performance suffers when many concurrent requests pass through the proxy. I implemented a simple proxy using twisted to add features to the company's internal website and improve its user experience. I will share it when I have the opportunity.

