|
1 | | -# python-proxy-headers |
| 1 | +The `python-proxy-headers` package provides support for handling custom proxy headers when making HTTP requests with various Python modules. |
2 | 2 |
|
3 | | -## urllib3 |
| 3 | +We currently add support to the following packages: |
| 4 | +* urllib3 |
| 5 | +* requests |
| 6 | +* httpx |
| 7 | +* aiohttp |
4 | 8 |
|
5 | | -ProxyManager inherits from PoolManager, __init__ accepts proxy_headers kwarg, puts into connection_pool_kw |
| 9 | +None of these modules provide good support for parsing custom response headers from proxy servers. And some of them make it hard to send custom headers to proxy servers. So we at [ProxyMesh](https://proxymesh.com) made these extension modules to support our customers that use Python and want to use custom headers to control our proxy behavior. But these modules can work for any custom headers through any proxy. |
6 | 10 |
|
7 | | -PoolManager passes connection_pool_kw as kwargs into HTTPSConnectionPool (HTTPConnectionPool just adds them to normal headers) |
| 11 | +Examples for how to use our support modules are described below. |
8 | 12 |
|
9 | | -//urlopen() creates conn through multiple calls ending up in _new_pool() |
| 13 | +## urllib3 |
10 | 14 |
|
11 | | -HTTPSConnectionPool __init__() receives _proxy_headers kwarg |
12 | | -in _prepare_proxy(), passes self.proxy_headers to HTTPSConnection.set_tunnel() as headers |
13 | | -_prepare_proxy() is called in HTTPConnectionPool.urlopen() with conn |
14 | | -conn is created in _new_conn() using ConnectionCls |
15 | | -_new_conn() is called by _get_conn() |
16 | | -HTTPSConnectionPool sets ConnectionCls to HTTPSConnection |
| 15 | +If you just want to send custom proxy headers, but don't need to receive proxy response headers, then you can use urllib3 directly, like so: |
17 | 16 |
|
18 | | -HTTPSConnection inherits from HTTPConnection |
19 | | -HTTPConnection inherits from http.client.HTTPConnection |
20 | | -HTTPConnection.set_tunnel() is simple wrapper around http.client.HTTPConnection.set_tunnel() |
21 | | -HTTPConnection defines own _tunnel() method for python < 3.11.4 |
22 | | -_tunnel() is called by connect() |
| 17 | +``` python |
| 18 | +import urllib3 |
| 19 | +proxy = urllib3.ProxyManager('http://PROXYHOST:PORT', proxy_headers={'X-ProxyMesh-Country': 'US'}) |
| 20 | +r = proxy.request('GET', 'https://api.ipify.org?format=json') |
| 21 | +``` |
23 | 22 |
|
24 | | -response headers are only available within _tunnel() method, set_tunnel() only sets the proxy headers |
| 23 | +Note that if you keep reusing the same `ProxyManager` instance, you may also be reusing the proxy connection, which can behave differently than creating a new proxy connection for each request. For example, with ProxyMesh, you may keep getting the same IP address if you reuse the proxy connection. |
25 | 24 |
|
26 | | -## http.client is python stdlib |
| 25 | +To get proxy response headers, use our extension module like this: |
27 | 26 |
|
28 | | -HTTPConnection.set_tunnel() receives headers, stores for passing in CONNECT method in _tunnel() |
| 27 | +``` python |
| 28 | +from python_proxy_headers import urllib3_proxy_manager |
| 29 | +proxy = urllib3_proxy_manager.ProxyHeaderManager('http://PROXYHOST:PORT') |
| 30 | +r = proxy.request('GET', 'https://api.ipify.org?format=json') |
| 31 | +r.headers['X-ProxyMesh-IP'] |
| 32 | +``` |
29 | 33 |
|
30 | | -in python3.12, _tunnel() reads proxy headers, saves them in _raw_proxy_headers |
31 | | -can get _raw_proxy_headers using get_proxy_response_headers() |
| 34 | +You can also pass `proxy_headers` into the `ProxyHeaderManager`. For example, you can pass back the same `X-ProxyMesh-IP` value to ensure you get the same IP address on subsequent requests. |
32 | 35 |
|
33 | | -for older python, need to patch _tunnel() to get response headers |
| 36 | +## requests |
| 37 | + |
| 38 | +The requests adapter builds on the `urllib3_proxy_manager` module to make it easy to pass in proxy headers and receive proxy response headers. |
34 | 39 |
|
35 | | -## TODO |
36 | | -1. figure easiest urllib3 based method to pass in proxy_headers for requests |
37 | 40 | ``` python |
38 | | -import urllib3 |
39 | | -proxy = urllib3.ProxyManager('http://de.proxymesh.com:31280', proxy_headers={'X-ProxyMesh-IP': '165.232.115.32'}) |
40 | | -r = proxy.request('GET', 'https://proxymesh.com/api/headers/') |
41 | | -# NOTE that when using this method, even without proxy_headers, the proxymesh proxy might still keep the same IP |
42 | | -# because urllib3 by default re-uses the connection |
| 41 | +from python_proxy_headers import requests_adapter |
| 42 | +r = requests_adapter.get('https://api.ipify.org?format=json', proxies={'http': 'http://PROXYHOST:PORT', 'https': 'http://PROXYHOST:PORT'}, proxy_headers={'X-ProxyMesh-Country': 'US'}) |
| 43 | +r.headers['X-ProxyMesh-IP'] |
43 | 44 | ``` |
44 | | -2. potentially create helper method(s) for doing this |
45 | | -3. figure out how to patch or extend urllib3 ProxyManager to get proxy response headers in python3.12 |
| 45 | + |
| 46 | +The `requests_adapter` module supports all the standard requests methods: `get`, `post`, `put`, `delete`, etc. |
| 47 | + |
| 48 | +## aiohttp |
| 49 | + |
| 50 | +While it's not documented, aiohttp does support passing in custom proxy headers by default. |
| 51 | + |
46 | 52 | ``` python |
47 | | -from python_proxy_headers import urllib3_proxy_manager |
48 | | -proxy = urllib3_proxy_manager.ProxyHeaderManager('http://de.proxymesh.com:31280', proxy_headers={'X-ProxyMesh-IP': '46.101.181.63'}) |
49 | | -r = proxy.request('GET', 'https://proxymesh.com/api/headers/') |
50 | | -r.headers['X-ProxyMesh-IP'] |
| 53 | +import aiohttp |
| 54 | +async with aiohttp.ClientSession() as session: |
| 55 | + async with session.get('https://api.ipify.org?format=json', proxy="http://PROXYHOST:PORT", proxy_headers={'X-ProxyMesh-Country': 'US'}) as r: |
| 56 | + await r.text() |
51 | 57 | ``` |
52 | | -4. figure out how to do create equivalent functionality for older pythons |
53 | | - * tested with python3.7 & urllib3 1.26.20 |
54 | | - * tested with python3.12 & urllib3 2.3.0 |
55 | | -5. figure out how python requests uses urllib3 and easiest method for passing in proxy headers |
56 | | -6. potentially create helper methods for doing this |
| 58 | + |
| 59 | +However, if you want to get proxy response headers, you should use our extension module: |
| 60 | + |
57 | 61 | ``` python |
58 | | -from python_proxy_headers import requests_adapter |
59 | | -r = requests_adapter.get('https://proxymesh.com/api/headers/', proxies={'http': 'http://de.proxymesh.com:31280', 'https': 'http://de.proxymesh.com:31280'}, proxy_headers={'x-proxymesh-ip': '46.101.236.88'}) |
| 62 | +from python_proxy_headers import aiohttp_proxy |
| 63 | +async with aiohttp_proxy.ProxyClientSession() as session: |
| 64 | + async with session.get('https://api.ipify.org?format=json', proxy="http://PROXYHOST:PORT", proxy_headers={'X-ProxyMesh-Country': 'US'}) as r: |
| 65 | + await r.text() |
| 66 | + |
60 | 67 | r.headers['X-ProxyMesh-IP'] |
61 | 68 | ``` |
62 | | -7. pass proxy response headers from urllib3 functions back to requests response |
63 | | - * tested on python3.7 & requests 2.31.0 |
64 | | - * tested with python3.12 & requests 2.32.3 |
65 | | -8. create adapters/extension for httpx library too |
66 | | - httpx Proxy class has headers attribute |
67 | | - can pass Proxy instance to HTTPTransport __init__() |
68 | | - if pass in Proxy, uses httpcore.HTTPProxy class for _pool |
69 | | - Client class can be given Proxy on __init__(), passes through to _init_proxy_transport() which creates a HTTPTransport instance |
70 | | - does not parse proxy response headers by default |
71 | | - https requests go through TunnelHTTPConnection |
72 | | - |
73 | | -passing in proxy headers works |
| 69 | + |
| 70 | +## httpx |
| 71 | + |
| 72 | +httpx also supports proxy headers by default, though it's not documented: |
| 73 | + |
74 | 74 | ``` python |
75 | 75 | import httpx |
76 | | -proxy = httpx.Proxy('http://de.proxymesh.com:31280', headers={'X-ProxyMesh-IP': '134.209.244.192'}) |
| 76 | +proxy = httpx.Proxy('http://PROXYHOST:PORT', headers={'X-ProxyMesh-Country': 'US'}) |
77 | 77 | mounts = {'http://': httpx.HTTPTransport(proxy=proxy), 'https://': httpx.HTTPTransport(proxy=proxy)} |
78 | 78 | with httpx.Client(mounts=mounts) as client: |
79 | | - r = client.get('https://proxymesh.com/api/headers/') |
| 79 | + r = client.get('https://api.ipify.org?format=json') |
80 | 80 | ``` |
81 | 81 |
|
82 | | -getting response headers works |
| 82 | +But to get the response headers, you need to use our extension module: |
| 83 | + |
83 | 84 | ``` python |
84 | 85 | import httpx |
85 | 86 | from python_proxy_headers.httpx_proxy import HTTPProxyTransport |
86 | | -proxy = httpx.Proxy('http://de.proxymesh.com:31280', headers={'X-ProxyMesh-IP': '134.209.244.192'}) |
87 | | -mounts = {'http://': HTTPProxyTransport(proxy=proxy), 'https://': HTTPProxyTransport(proxy=proxy)} |
88 | | -with httpx.Client(mounts=mounts) as client: |
89 | | - r = client.get('https://proxymesh.com/api/headers/') |
| 87 | +proxy = httpx.Proxy('http://PROXYHOST:PORT', headers={'X-ProxyMesh-Country': 'US'}) |
| 88 | +transport = HTTPProxyTransport(proxy=proxy) |
| 89 | +with httpx.Client(mounts={'http://': transport, 'https://': transport}) as client: |
| 90 | + r = client.get('https://api.ipify.org?format=json') |
90 | 91 |
|
91 | 92 | r.headers['X-ProxyMesh-IP'] |
92 | 93 | ``` |
93 | 94 |
|
94 | | -helper methods |
| 95 | +This module also provides helper methods similar to requests: |
| 96 | + |
95 | 97 | ``` python |
96 | 98 | import httpx |
97 | 99 | from python_proxy_headers import httpx_proxy |
98 | | -proxy = httpx.Proxy('http://de.proxymesh.com:31280', headers={'X-ProxyMesh-IP': '134.209.244.192'}) |
99 | | -r = httpx_proxy.get('https://proxymesh.com/api/headers/', proxy=proxy) |
| 100 | +proxy = httpx.Proxy('http://PROXYHOST:PORT', headers={'X-ProxyMesh-Country': 'US'}) |
| 101 | +r = httpx_proxy.get('https://api.ipify.org?format=json', proxy=proxy) |
100 | 102 | r.headers['X-ProxyMesh-IP'] |
101 | 103 | ``` |
102 | 104 |
|
103 | | -9. Figure out if httpx async is worth extending |
| 105 | +And finally, httpx supports async requests, so we provide an async extension too: |
104 | 106 |
|
105 | 107 | ``` python |
106 | 108 | import httpx |
107 | 109 | from python_proxy_headers.httpx_proxy import AsyncHTTPProxyTransport |
108 | | -proxy = httpx.Proxy('http://de.proxymesh.com:31280', headers={'X-ProxyMesh-IP': '134.209.244.192'}) |
109 | | -mounts = {'http://': AsyncHTTPProxyTransport(proxy=proxy), 'https://': AsyncHTTPProxyTransport(proxy=proxy)} |
110 | | -async with httpx.AsyncClient(mounts=mounts) as client: |
111 | | - r = await client.get('https://proxymesh.com/api/headers/') |
| 110 | +proxy = httpx.Proxy('http://PROXYHOST:PORT', headers={'X-ProxyMesh-Country': 'US'}) |
| 111 | +transport = AsyncHTTPProxyTransport(proxy=proxy) |
| 112 | +async with httpx.AsyncClient(mounts={'http://': transport, 'https://': transport}) as client: |
| 113 | + r = await client.get('https://api.ipify.org?format=json') |
112 | 114 |
|
113 | 115 | r.headers['X-ProxyMesh-IP'] |
114 | 116 | ``` |
115 | | - |
116 | | -10. Is there a requests async library worth extending? aiohttp |
117 | | - |
118 | | -proxy headers works |
119 | | -tested with aiohttp 3.11.12 and python3.12 |
120 | | -``` python |
121 | | -from python_proxy_headers import aiohttp_proxy |
122 | | -async with aiohttp_proxy.ProxyClientSession() as session: |
123 | | - async with session.get('https://proxymesh.com/api/headers/', proxy="http://de.proxymesh.com:31280", proxy_headers={'X-ProxyMesh-IP': '46.101.236.88'}) as r: |
124 | | - await r.text() |
125 | | - |
126 | | -r.headers['X-ProxyMesh-IP'] |
127 | | -``` |
128 | | - |
129 | | -11. Update proxy-examples repository |
130 | | - |
131 | | -**TODO: rename modules to be more clear** |
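
The removed notes point out that proxy response headers only exist in the proxy's reply to the `CONNECT` request, which is why they never show up on a normal response object. As a rough illustration of what reading that reply involves, here is a minimal sketch using only the standard library. The function name `parse_connect_response` and the sample reply (including the `203.0.113.7` address) are hypothetical, and this is not how the extension modules in this commit are implemented.

``` python
import http.client
import io


def parse_connect_response(raw: bytes):
    """Split a raw proxy reply to CONNECT into (status_code, headers)."""
    fp = io.BytesIO(raw)
    # The status line looks like: HTTP/1.1 200 Connection established
    parts = fp.readline().decode("iso-8859-1").split(None, 2)
    status_code = int(parts[1])
    # parse_headers() reads header lines up to the blank line and returns
    # an http.client.HTTPMessage, which supports case-insensitive lookup.
    headers = http.client.parse_headers(fp)
    return status_code, headers


# Hypothetical reply, shaped like what a proxy might send back on CONNECT.
raw_reply = (
    b"HTTP/1.1 200 Connection established\r\n"
    b"X-ProxyMesh-IP: 203.0.113.7\r\n"
    b"\r\n"
)
status_code, proxy_headers = parse_connect_response(raw_reply)
print(status_code, proxy_headers["x-proxymesh-ip"])
```

In a real client these bytes are consumed inside the tunnel setup, so the headers have to be captured there and attached to the final response, which is the gap the extension modules are meant to fill.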
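
The new README mentions passing the same `X-ProxyMesh-IP` value back to keep the same IP address on later requests. A small helper along these lines might make that pattern easier to reuse. The name `sticky_proxy_headers` is hypothetical and not part of the package, and the sample header value is made up for illustration.

``` python
def sticky_proxy_headers(response_headers, header_name="X-ProxyMesh-IP"):
    """Build a proxy_headers dict that echoes one proxy response header back.

    `response_headers` is any mapping with a .get() method, such as the
    headers of a previous response. Returns an empty dict when the header
    is absent, so the result can always be passed along safely.
    """
    value = response_headers.get(header_name)
    return {header_name: value} if value else {}


# Hypothetical headers from a first response that went through the proxy.
first_response_headers = {"X-ProxyMesh-IP": "203.0.113.7"}
next_proxy_headers = sticky_proxy_headers(first_response_headers)
```

The resulting dict could then be passed as `proxy_headers` when creating the next `ProxyHeaderManager`, as described in the urllib3 section of the README.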