Commit f084342

jupyter notebook based tutorial (#1059)
* `jupyter` notebook based tutorial
* Move within `tutorial` directory
* Fix spelling
* Add `as_non_blocking` option during wrap
* `as_non_blocking=False` by default
1 parent 4af0c2f commit f084342

9 files changed: +394 -8 lines changed
Makefile

Lines changed: 2 additions & 2 deletions
@@ -94,7 +94,7 @@ lib-clean:
 	rm -rf .hypothesis
 	# Doc RST files are cached and can cause issues
 	# See https://github.com/abhinavsingh/proxy.py/issues/642#issuecomment-1003444578
-	rm docs/pkg/*.rst
+	rm -f docs/pkg/*.rst

 lib-dep:
 	pip install --upgrade pip && \
@@ -134,7 +134,7 @@ lib-doc:
 	python -m tox -e build-docs && \
 	$(OPEN) .tox/build-docs/docs_out/index.html || true

-lib-coverage:
+lib-coverage: lib-clean
 	pytest --cov=proxy --cov=tests --cov-report=html tests/ && \
 	$(OPEN) htmlcov/index.html || true

proxy/core/connection/server.py

Lines changed: 8 additions & 2 deletions
@@ -43,7 +43,12 @@ def connect(
         )
         self.closed = False

-    def wrap(self, hostname: str, ca_file: Optional[str] = None) -> None:
+    def wrap(
+            self,
+            hostname: str,
+            ca_file: Optional[str] = None,
+            as_non_blocking: bool = False,
+    ) -> None:
         ctx = ssl.create_default_context(
             ssl.Purpose.SERVER_AUTH, cafile=ca_file,
         )
@@ -54,4 +59,5 @@ def wrap(self, hostname: str, ca_file: Optional[str] = None) -> None:
             self.connection,
             server_hostname=hostname,
         )
-        self.connection.setblocking(False)
+        if as_non_blocking:
+            self.connection.setblocking(False)
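
To make the effect of the new flag concrete, here is a minimal sketch that skips the TLS handshake entirely and mirrors only the final conditional of `wrap`. The names `finish_wrap` and `demo` are hypothetical and not part of proxy.py, and a local `socket.socketpair()` stands in for the upstream connection.

```python
import socket
from typing import Dict


def finish_wrap(conn: socket.socket, as_non_blocking: bool = False) -> socket.socket:
    """Hypothetical stand-in for the tail of ``wrap`` (no TLS handshake here)."""
    # Same conditional as in the diff: only opt-in callers get a
    # non-blocking socket, everyone else keeps the blocking default.
    if as_non_blocking:
        conn.setblocking(False)
    return conn


def demo() -> Dict[str, bool]:
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    try:
        blocking = finish_wrap(a)
        non_blocking = finish_wrap(c, as_non_blocking=True)
        result = {
            'default_is_blocking': blocking.getblocking(),
            'opt_in_is_blocking': non_blocking.getblocking(),
        }
        # With nothing to read, a non-blocking recv fails immediately
        # instead of waiting for the peer.
        try:
            non_blocking.recv(1)
            result['recv_raised'] = False
        except BlockingIOError:
            result['recv_raised'] = True
        return result
    finally:
        for sock in (a, b, c, d):
            sock.close()


if __name__ == '__main__':
    print(demo())
```

With the default left at `False`, a simple `recv()` loop such as the one in the tutorial notebook can wait for data, while the proxy and reverse proxy call sites in this commit opt in with `as_non_blocking=True`.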

proxy/http/proxy/server.py

Lines changed: 5 additions & 1 deletion
@@ -762,7 +762,11 @@ def wrap_server(self) -> bool:
         assert isinstance(self.upstream.connection, socket.socket)
         do_close = False
         try:
-            self.upstream.wrap(text_(self.request.host), self.flags.ca_file)
+            self.upstream.wrap(
+                text_(self.request.host),
+                self.flags.ca_file,
+                as_non_blocking=True,
+            )
         except ssl.SSLCertVerificationError:   # Server raised certificate verification error
             # When --disable-interception-on-ssl-cert-verification-error flag is on,
             # we will cache such upstream hosts and avoid intercepting them for future

proxy/http/server/reverse.py

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ def handle_request(self, request: HttpParser) -> None:
                     text_(
                         self.choice.hostname,
                     ),
+                    as_non_blocking=True,
                 )
                 # Update Host header
                 # if request.has_header(b'Host'):

tests/http/proxy/test_http_proxy_tls_interception.py

Lines changed: 2 additions & 2 deletions
@@ -72,8 +72,8 @@ def mock_connection() -> Any:

         # Do not mock the original wrap method
         self.mock_server_conn.return_value.wrap.side_effect = \
-            lambda x, y: TcpServerConnection.wrap(
-                self.mock_server_conn.return_value, x, y,
+            lambda x, y, as_non_blocking: TcpServerConnection.wrap(
+                self.mock_server_conn.return_value, x, y, as_non_blocking=as_non_blocking,
             )

         type(self.mock_server_conn.return_value).connection = \

tests/plugin/test_http_proxy_plugins_with_tls_interception.py

Lines changed: 3 additions & 1 deletion
@@ -95,7 +95,9 @@ def mock_connection() -> Any:

         # Do not mock the original wrap method
         self.server.wrap.side_effect = \
-            lambda x, y: TcpServerConnection.wrap(self.server, x, y)
+            lambda x, y, as_non_blocking: TcpServerConnection.wrap(
+                self.server, x, y, as_non_blocking=as_non_blocking,
+            )

         self.server.has_buffer.side_effect = has_buffer
         type(self.server).closed = mocker.PropertyMock(side_effect=closed)
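
The two test changes above follow from how `unittest.mock` invokes `side_effect`: the callable receives exactly the arguments the mock was called with, so once callers pass `as_non_blocking=True` the old two-argument lambda no longer matches. The sketch below illustrates this with a hypothetical `FakeConnection` class standing in for `TcpServerConnection`; it is not the project's test code.

```python
from typing import List, Optional, Tuple
from unittest import mock


class FakeConnection:
    """Hypothetical stand-in for TcpServerConnection; records wrap() calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Optional[str], bool]] = []

    def wrap(
        self,
        hostname: str,
        ca_file: Optional[str] = None,
        as_non_blocking: bool = False,
    ) -> None:
        self.calls.append((hostname, ca_file, as_non_blocking))


def run() -> Tuple[bool, List[Tuple[str, Optional[str], bool]]]:
    real = FakeConnection()
    mocked = mock.MagicMock()

    # Old-style side effect: accepts only two positional arguments.
    mocked.wrap.side_effect = lambda x, y: FakeConnection.wrap(real, x, y)
    try:
        mocked.wrap('example.test', None, as_non_blocking=True)
        old_lambda_worked = True
    except TypeError:
        # The unexpected keyword argument is rejected by the lambda.
        old_lambda_worked = False

    # Updated side effect: forwards the keyword to the real method.
    mocked.wrap.side_effect = lambda x, y, as_non_blocking: FakeConnection.wrap(
        real, x, y, as_non_blocking=as_non_blocking,
    )
    mocked.wrap('example.test', None, as_non_blocking=True)
    return old_lambda_worked, real.calls


if __name__ == '__main__':
    print(run())
```

Note that the updated lambda has no default for `as_non_blocking`, so a caller that omits the argument would raise `TypeError`. That fits these tests only because the code under test always passes it.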

tutorial/connections.ipynb

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Connections\n",
8+
"\n",
9+
"## Buffered Socket Connections\n",
10+
"\n",
11+
"`proxy.py` core provides buffered socket implementations. In most cases, a buffered connection is what you want. With buffered connections, application code can queue data while leaving the responsibility of flushing the buffer to the core.\n",
12+
"\n",
13+
"One of the buffered connection classes is `TcpServerConnection`, which manages a connection to an upstream server. Optionally, we can also enable encryption _(TLS)_ before communicating with the server.\n",
14+
"\n",
15+
"Import the following:"
16+
]
17+
},
18+
{
19+
"cell_type": "code",
20+
"execution_count": 3,
21+
"metadata": {},
22+
"outputs": [],
23+
"source": [
24+
"from proxy.core.connection import TcpServerConnection\n",
25+
"from proxy.common.utils import build_http_request\n",
26+
"from proxy.http.methods import httpMethods\n",
27+
"from proxy.http.parser import HttpParser, httpParserTypes\n",
28+
"\n",
29+
"request = build_http_request(\n",
30+
" method=httpMethods.GET,\n",
31+
" url=b'/',\n",
32+
" headers={\n",
33+
" b'Host': b'jaxl.com',\n",
34+
" }\n",
35+
")"
36+
]
37+
},
38+
{
39+
"cell_type": "markdown",
40+
"metadata": {},
41+
"source": [
42+
"Let's use `TcpServerConnection` to make an HTTP web server request."
43+
]
44+
},
45+
{
46+
"cell_type": "code",
47+
"execution_count": null,
48+
"metadata": {},
49+
"outputs": [],
50+
"source": [
51+
"http_client = TcpServerConnection('jaxl.com', 80)\n",
52+
"http_client.connect()\n",
53+
"http_client.queue(memoryview(request))\n",
54+
"http_client.flush()\n",
55+
"\n",
56+
"http_response = HttpParser(httpParserTypes.RESPONSE_PARSER)\n",
57+
"while not http_response.is_complete:\n",
58+
" http_response.parse(http_client.recv().tobytes())\n",
59+
"http_client.close()\n",
60+
"\n",
61+
"print(http_response.build_response())\n",
62+
"\n",
63+
"assert http_response.is_complete\n",
64+
"assert http_response.code == b'301'\n",
65+
"assert http_response.reason == b'Moved Permanently'\n",
66+
"assert http_response.has_header(b'location')\n",
67+
"assert http_response.header(b'location') == b'https://jaxl.com/'"
68+
]
69+
},
70+
{
71+
"cell_type": "markdown",
72+
"metadata": {},
73+
"source": [
74+
"Let's use `TcpServerConnection` to make an HTTPS web server request."
75+
]
76+
},
77+
{
78+
"cell_type": "code",
79+
"execution_count": null,
80+
"metadata": {},
81+
"outputs": [],
82+
"source": [
83+
"https_client = TcpServerConnection('jaxl.com', 443)\n",
84+
"https_client.connect()\n",
85+
"https_client.wrap(hostname='jaxl.com')\n",
86+
"\n",
87+
"https_client.queue(memoryview(request))\n",
88+
"https_client.flush()\n",
89+
"\n",
90+
"https_response = HttpParser(httpParserTypes.RESPONSE_PARSER)\n",
91+
"while not https_response.is_complete:\n",
92+
" https_response.parse(https_client.recv().tobytes())\n",
93+
"https_client.close()\n",
94+
"\n",
95+
"print(https_response.build_response())\n",
96+
"\n",
97+
"assert https_response.is_complete\n",
98+
"assert https_response.code == b'200'\n",
99+
"assert https_response.reason == b'OK'\n",
100+
"assert https_response.has_header(b'content-type')\n",
101+
"assert https_response.header(b'content-type') == b'text/html'"
102+
]
103+
}
104+
],
105+
"metadata": {
106+
"interpreter": {
107+
"hash": "da9d6927d62b2b95bde149eedfbd5367cb7f465aad65a736f49c99ee3db39df7"
108+
},
109+
"kernelspec": {
110+
"display_name": "Python 3.10.0 64-bit ('venv310': venv)",
111+
"language": "python",
112+
"name": "python3"
113+
},
114+
"language_info": {
115+
"codemirror_mode": {
116+
"name": "ipython",
117+
"version": 3
118+
},
119+
"file_extension": ".py",
120+
"mimetype": "text/x-python",
121+
"name": "python",
122+
"nbconvert_exporter": "python",
123+
"pygments_lexer": "ipython3",
124+
"version": "3.10.0"
125+
},
126+
"orig_nbformat": 4
127+
},
128+
"nbformat": 4,
129+
"nbformat_minor": 2
130+
}
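
The queue-then-flush pattern used in the notebook above can be sketched without proxy.py installed. `BufferedConn` below is a hypothetical, simplified class written for illustration only, not the library's implementation; a local `socket.socketpair()` replaces the real upstream server.

```python
import socket
from typing import List


class BufferedConn:
    """Hypothetical minimal buffered connection (illustration only)."""

    def __init__(self, sock: socket.socket) -> None:
        self.sock = sock
        self.buffer: List[memoryview] = []

    def queue(self, mv: memoryview) -> None:
        # Application code only appends; nothing touches the socket yet.
        self.buffer.append(mv)

    def has_buffer(self) -> bool:
        return len(self.buffer) > 0

    def flush(self) -> int:
        # Send the oldest chunk; keep the unsent tail on a partial write.
        if not self.buffer:
            return 0
        mv = self.buffer[0]
        sent = self.sock.send(mv)
        if sent == len(mv):
            self.buffer.pop(0)
        else:
            self.buffer[0] = mv[sent:]
        return sent


def demo() -> bytes:
    left, right = socket.socketpair()
    try:
        conn = BufferedConn(left)
        chunks = [b'GET / HTTP/1.1\r\n', b'Host: example.test\r\n\r\n']
        for chunk in chunks:
            conn.queue(memoryview(chunk))
        while conn.has_buffer():
            conn.flush()
        expected = sum(len(chunk) for chunk in chunks)
        received = b''
        while len(received) < expected:
            received += right.recv(1024)
        return received
    finally:
        left.close()
        right.close()


if __name__ == '__main__':
    print(demo())
```

Separating `queue` from `flush` lets the caller decide when the socket is ready for writing, which is the responsibility the tutorial describes as being left to the core.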

tutorial/http_parser.ipynb

Lines changed: 182 additions & 0 deletions
@@ -0,0 +1,182 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# HttpParser\n",
8+
"\n",
9+
"The `HttpParser` class is at the heart of everything related to HTTP. It is used by the web server and proxy server cores and their plugin ecosystems. As the name suggests, it can parse both HTTP request and response packets. It can also parse HTTP look-alike protocols such as ICAP and SIP. Most importantly, remember that `HttpParser` was originally written to handle HTTP packets arriving in the context of a proxy server, and to this day its default behavior favors that use case.\n",
10+
"\n",
11+
"Let's start by parsing an HTTP web request using `HttpParser`"
12+
]
13+
},
14+
{
15+
"cell_type": "code",
16+
"execution_count": 1,
17+
"metadata": {},
18+
"outputs": [
19+
{
20+
"name": "stdout",
21+
"output_type": "stream",
22+
"text": [
23+
"b'GET / HTTP/1.1\\r\\nHost: jaxl.com\\r\\n\\r\\n'\n"
24+
]
25+
}
26+
],
27+
"source": [
28+
"from proxy.http.methods import httpMethods\n",
29+
"from proxy.http.parser import HttpParser, httpParserTypes, httpParserStates\n",
30+
"from proxy.common.constants import HTTP_1_1\n",
31+
"\n",
32+
"get_request = HttpParser(httpParserTypes.REQUEST_PARSER)\n",
33+
"get_request.parse(b'GET / HTTP/1.1\\r\\nHost: jaxl.com\\r\\n\\r\\n')\n",
34+
"\n",
35+
"print(get_request.build())\n",
36+
"\n",
37+
"assert get_request.is_complete\n",
38+
"assert get_request.method == httpMethods.GET\n",
39+
"assert get_request.version == HTTP_1_1\n",
40+
"assert get_request.host == None\n",
41+
"assert get_request.port == 80\n",
42+
"assert get_request._url != None\n",
43+
"assert get_request._url.remainder == b'/'\n",
44+
"assert get_request.has_header(b'host')\n",
45+
"assert get_request.header(b'host') == b'jaxl.com'\n",
46+
"assert len(get_request.headers) == 1"
47+
]
48+
},
49+
{
50+
"cell_type": "markdown",
51+
"metadata": {},
52+
"source": [
53+
"Next, let's parse an HTTP proxy request using `HttpParser`"
54+
]
55+
},
56+
{
57+
"cell_type": "code",
58+
"execution_count": 2,
59+
"metadata": {},
60+
"outputs": [
61+
{
62+
"name": "stdout",
63+
"output_type": "stream",
64+
"text": [
65+
"b'GET / HTTP/1.1\\r\\nHost: jaxl.com\\r\\n\\r\\n'\n",
66+
"b'GET http://jaxl.com:80/ HTTP/1.1\\r\\nHost: jaxl.com\\r\\n\\r\\n'\n"
67+
]
68+
}
69+
],
70+
"source": [
71+
"proxy_request = HttpParser(httpParserTypes.REQUEST_PARSER)\n",
72+
"proxy_request.parse(b'GET http://jaxl.com/ HTTP/1.1\\r\\nHost: jaxl.com\\r\\n\\r\\n')\n",
73+
"\n",
74+
"print(proxy_request.build())\n",
75+
"print(proxy_request.build(for_proxy=True))\n",
76+
"\n",
77+
"assert proxy_request.is_complete\n",
78+
"assert proxy_request.method == httpMethods.GET\n",
79+
"assert proxy_request.version == HTTP_1_1\n",
80+
"assert proxy_request.host == b'jaxl.com'\n",
81+
"assert proxy_request.port == 80\n",
82+
"assert proxy_request._url != None\n",
83+
"assert proxy_request._url.remainder == b'/'\n",
84+
"assert proxy_request.has_header(b'host')\n",
85+
"assert proxy_request.header(b'host') == b'jaxl.com'\n",
86+
"assert len(proxy_request.headers) == 1"
87+
]
88+
},
89+
{
90+
"cell_type": "markdown",
91+
"metadata": {},
92+
"source": [
93+
"Notice how `proxy_request.build()` and `proxy_request.build(for_proxy=True)` behave: the first emits an origin-form request line, the second keeps the absolute URL. Also, notice how the `proxy_request.host` field is populated for an HTTP proxy packet but not for the prior HTTP web request packet example.\n",
94+
"\n",
95+
"To conclude, let's parse an HTTPS proxy request"
96+
]
97+
},
98+
{
99+
"cell_type": "code",
100+
"execution_count": 4,
101+
"metadata": {},
102+
"outputs": [
103+
{
104+
"name": "stdout",
105+
"output_type": "stream",
106+
"text": [
107+
"b'CONNECT / HTTP/1.1\\r\\nHost: jaxl.com:443\\r\\n\\r\\n'\n",
108+
"b'CONNECT jaxl.com:443 HTTP/1.1\\r\\nHost: jaxl.com:443\\r\\n\\r\\n'\n"
109+
]
110+
}
111+
],
112+
"source": [
113+
"connect_request = HttpParser(httpParserTypes.REQUEST_PARSER)\n",
114+
"connect_request.parse(b'CONNECT jaxl.com:443 HTTP/1.1\\r\\nHost: jaxl.com:443\\r\\n\\r\\n')\n",
115+
"\n",
116+
"print(connect_request.build())\n",
117+
"print(connect_request.build(for_proxy=True))\n",
118+
"\n",
119+
"assert connect_request.is_complete\n",
120+
"assert connect_request.is_https_tunnel\n",
121+
"assert connect_request.version == HTTP_1_1\n",
122+
"assert connect_request.host == b'jaxl.com'\n",
123+
"assert connect_request.port == 443\n",
124+
"assert connect_request._url != None\n",
125+
"assert connect_request._url.remainder == None\n",
126+
"assert connect_request.has_header(b'host')\n",
127+
"assert connect_request.header(b'host') == b'jaxl.com:443'\n",
128+
"assert len(connect_request.headers) == 1"
129+
]
130+
},
131+
{
132+
"cell_type": "markdown",
133+
"metadata": {},
134+
"source": [
135+
"### Take Away\n",
136+
"\n",
137+
"- `host` and `port` attributes represent the `host:port` present in the HTTP packet request line. For comparison, all three request lines are listed again below. Notice that only the web server GET request has no `host:port` in its request line\n",
138+
" | Request Type | Request Line |\n",
139+
" | ------------------| ---------------- |\n",
140+
" | `get_request` | `GET / HTTP/1.1` |\n",
141+
"  | `proxy_request`   | `GET http://jaxl.com/ HTTP/1.1` |\n",
142+
" | `connect_request` | `CONNECT jaxl.com:443 HTTP/1.1` |\n",
143+
"- `_url` attribute is an instance of `Url` parser and contains parsed information about the URL found in the request line\n",
144+
"\n",
145+
"A few of the other handy properties within `HttpParser` are:\n",
146+
"\n",
147+
"- `is_complete`\n",
148+
"- `is_http_1_1_keep_alive`\n",
149+
"- `is_connection_upgrade`\n",
150+
"- `is_https_tunnel`\n",
151+
"- `is_chunked_encoded`\n",
152+
"- `content_expected`\n",
153+
"- `body_expected`"
154+
]
155+
}
156+
],
157+
"metadata": {
158+
"interpreter": {
159+
"hash": "da9d6927d62b2b95bde149eedfbd5367cb7f465aad65a736f49c99ee3db39df7"
160+
},
161+
"kernelspec": {
162+
"display_name": "Python 3.10.0 64-bit ('venv310': venv)",
163+
"language": "python",
164+
"name": "python3"
165+
},
166+
"language_info": {
167+
"codemirror_mode": {
168+
"name": "ipython",
169+
"version": 3
170+
},
171+
"file_extension": ".py",
172+
"mimetype": "text/x-python",
173+
"name": "python",
174+
"nbconvert_exporter": "python",
175+
"pygments_lexer": "ipython3",
176+
"version": "3.10.0"
177+
},
178+
"orig_nbformat": 4
179+
},
180+
"nbformat": 4,
181+
"nbformat_minor": 2
182+
}
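
The take-away about `host` and `port` can be illustrated with a small standalone sketch. `request_line_host_port` is a hypothetical helper built on the standard library's `urlsplit`, not `HttpParser`'s actual logic. It only mirrors the values asserted in the notebook cells above, including the default port of 80 for the plain web request.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def request_line_host_port(line: bytes) -> Tuple[Optional[bytes], int]:
    """Hypothetical helper: derive (host, port) from an HTTP request line."""
    method, target, _version = line.decode('ascii').split(' ', 2)
    if method == 'CONNECT':
        # Authority form, e.g. "jaxl.com:443"
        host, _, port = target.rpartition(':')
        return host.encode('ascii'), int(port)
    if target.startswith('/'):
        # Origin form carries no host; the notebook shows port 80 here.
        return None, 80
    # Absolute form, e.g. "http://jaxl.com/"
    parts = urlsplit(target)
    default_port = 443 if parts.scheme == 'https' else 80
    host = parts.hostname.encode('ascii') if parts.hostname else None
    return host, parts.port or default_port


if __name__ == '__main__':
    for request_line in (
        b'GET / HTTP/1.1',
        b'GET http://jaxl.com/ HTTP/1.1',
        b'CONNECT jaxl.com:443 HTTP/1.1',
    ):
        print(request_line, request_line_host_port(request_line))
```

In this sketch the `Host` header is never consulted, which matches the notebook's observation that `get_request.host` stays `None` even though a `Host: jaxl.com` header is present.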
