Description
I am trying to use proxy.py in embedded non-blocking mode to help collect data from the web. However, it only works in standard (non-embedded) mode and fails in embedded mode.
Below is my test code (run on Windows 10, Python 3.7.9, proxy.py v2.4.3; the code is adapted from the article https://webelement.click/en/four_simple_steps_to_add_custom_http_headers_in_selenium_webdriver_tests_in_python):
import time

import proxy
from proxy.common import utils
from selenium import webdriver
from selenium.webdriver import DesiredCapabilities
from selenium.webdriver.common.by import By
from selenium.webdriver.common.proxy import ProxyType
from selenium.webdriver.firefox.options import Options

import header_modifier

selenium_proxy = webdriver.Proxy()
proxy_port = 8899
proxy_url = '127.0.0.1:' + str(proxy_port)

# Start proxy.py in embedded mode; it is shut down when the with block exits.
with proxy.Proxy(
        ['--host', '127.0.0.1',
         '--port', str(proxy_port),
         '--num-workers', '1',  # argparse expects string arguments
         '--log-level', 'd',
         '--ca-cert-file', '/test/ws-ca.pem',
         '--ca-key-file', '/test/ws-ca.key',
         '--ca-signing-key-file', '/test/ws-signing.key'],
        plugins=[b'header_modifier.BasicAuthorizationPlugin',
                 header_modifier.BasicAuthorizationPlugin]):
    selenium_proxy.proxyType = ProxyType.AUTODETECT
    capabilities = DesiredCapabilities.FIREFOX
    selenium_proxy.add_to_capabilities(capabilities)
    options = Options()
    options.headless = True
    options.add_argument('--proxy-server=%s' % proxy_url)
    driver = webdriver.Firefox(options=options, capabilities=capabilities)
    driver.get('https://www.webelement.click/stand/basic?lang=en')
    time.sleep(5)
    assert driver.find_element(By.TAG_NAME, 'h2').text == 'You have authorized successfully!'
    driver.quit()
Running the code above produces the following output:
2022-10-01 23:29:28,478 - pid:69024 [I] plugins.load:85 - Loaded plugin proxy.http.proxy.HttpProxyPlugin
2022-10-01 23:29:28,478 - pid:69024 [I] plugins.load:85 - Loaded plugin header_modifier.BasicAuthorizationPlugin
2022-10-01 23:29:28,478 - pid:69024 [I] plugins.load:85 - Loaded plugin main.BasicAuthorizationPlugin
2022-10-01 23:29:28,484 - pid:69024 [I] tcp.listen:82 - Listening on 127.0.0.1:8899
2022-10-01 23:29:28,713 - pid:69024 [D] pool._start:151 - Started acceptor#0 process 62732
2022-10-01 23:29:28,713 - pid:69024 [I] pool.setup:108 - Started 1 acceptors in threaded mode
2022-10-01 23:29:28,715 - pid:69024 [I] pool.shutdown:125 - Shutting down 1 acceptors
2022-10-01 23:29:29,869 - pid:62732 [D] acceptor.run:182 - Acceptor#0 shutdown
2022-10-01 23:29:29,949 - pid:69024 [D] pool.shutdown:130 - Acceptors shutdown
...
Note that the acceptor was shut down almost immediately after it started.
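One thing this timing would be consistent with (an assumption, the log above does not confirm it) is the `with` block being left almost immediately, for example because of an exception in its body. A context manager tears its resource down as soon as the block ends. The sketch below illustrates that lifetime rule with a hypothetical stand-in, `listening_server`, built on plain sockets; it is not proxy.py itself.

```python
import contextlib
import socket


@contextlib.contextmanager
def listening_server(host="127.0.0.1", port=0):
    """Hypothetical stand-in for an embedded proxy: listens only inside the with block."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))  # port 0 lets the OS pick a free port
    server.listen()
    try:
        yield server.getsockname()[1]  # the port that was actually bound
    finally:
        server.close()  # runs as soon as the with block is left


def can_connect(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


with listening_server() as port:
    reachable_inside = can_connect(port)  # the listener is still up here

reachable_after = can_connect(port)  # the listener has been closed by now
print(reachable_inside, reachable_after)
```

Any work that needs the listener has to run inside the indented body of the `with` statement.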
To verify the settings, I split the process into two steps.
First, I started the proxy in standard (non-embedded) mode from the command line (the directory containing header_modifier.py must be added to PYTHONPATH first):
proxy --host 127.0.0.1 --port 8899 --num-workers 1 --log-level d --ca-cert-file /test/ws-ca.pem --ca-key-file /test/ws-ca.key --ca-signing-key-file /test/ws-signing.key --plugins header_modifier.BasicAuthorizationPlugin
The proxy appears to work and does not terminate right away:
2022-10-01 23:21:17,402 - pid:70548 [I] plugins.load:85 - Loaded plugin proxy.http.proxy.HttpProxyPlugin
2022-10-01 23:21:17,403 - pid:70548 [I] plugins.load:85 - Loaded plugin header_modifier.BasicAuthorizationPlugin
2022-10-01 23:21:17,410 - pid:70548 [I] tcp.listen:82 - Listening on 127.0.0.1:8899
2022-10-01 23:21:17,802 - pid:70548 [D] pool._start:151 - Started acceptor#0 process 44324
2022-10-01 23:21:17,802 - pid:70548 [I] pool.setup:108 - Started 1 acceptors in threaded mode
Second, I tested the proxy with the code below:
from selenium import webdriver
from selenium.webdriver import DesiredCapabilities
from selenium.webdriver.common.by import By
from selenium.webdriver.common.proxy import ProxyType
from selenium.webdriver.firefox.options import Options

selenium_proxy = webdriver.Proxy()
selenium_proxy.proxyType = ProxyType.AUTODETECT
capabilities = DesiredCapabilities.FIREFOX
selenium_proxy.add_to_capabilities(capabilities)
options = Options()
options.headless = True
proxy_port = 8899
proxy_url = '127.0.0.1:' + str(proxy_port)
#options.add_argument('--proxy-server=%s' % proxy)
options.add_argument("--proxy-server=http://{0}".format(proxy_url))
driver = webdriver.Firefox(options=options, capabilities=capabilities)
#driver.maximize_window()
driver.get('https://www.webelement.click/stand/basic?lang=en')
assert driver.find_element(By.CSS_SELECTOR, '.post-body h2').text == 'You have authorized successfully!'
The test results show that this works as expected.
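A more explicit way to point Firefox at the proxy, should the `--proxy-server` argument turn out not to be honoured (an assumption I have not verified), is through Firefox's manual proxy preferences. The helper name `firefox_manual_proxy_prefs` below is hypothetical; the preference keys are Firefox's own.

```python
def firefox_manual_proxy_prefs(host, port):
    """Return Firefox preferences that select a manual HTTP/HTTPS proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,  # proxy used for HTTPS requests
        "network.proxy.ssl_port": port,
    }


prefs = firefox_manual_proxy_prefs("127.0.0.1", 8899)
for name, value in prefs.items():
    # With Selenium, each pair would be applied via options.set_preference(name, value)
    print(name, value)
```

Each pair can be passed to `set_preference` on the Firefox `Options` object before the driver is created.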
So it seems that standard mode works but embedded non-blocking mode doesn't. Can anyone help with this issue?
Thanks!
Leo
P.S. I also copied the sample plugin code below (saved as header_modifier.py) from the reference site:
from proxy.http.proxy import HttpProxyBasePlugin
from proxy.http.parser import HttpParser
from typing import Optional
import base64


class BasicAuthorizationPlugin(HttpProxyBasePlugin):
    """Modifies request headers."""

    def before_upstream_connection(
            self, request: HttpParser) -> Optional[HttpParser]:
        return request

    def handle_client_request(
            self, request: HttpParser) -> Optional[HttpParser]:
        basic_auth_header = 'Basic ' + base64.b64encode('webelement:click'.encode('utf-8')).decode('utf-8')
        request.add_header('Authorization'.encode('utf-8'), basic_auth_header.encode('utf-8'))
        return request

    def on_upstream_connection_close(self) -> None:
        pass

    def handle_upstream_chunk(self, chunk: memoryview) -> memoryview:
        return chunk
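The header value the plugin injects can be checked on its own, without running the proxy. This is a minimal sketch; the helper name `basic_auth_header` is hypothetical, and the credentials are the ones used in the plugin above.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    credentials = "{0}:{1}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("webelement", "click")

# Decode the token again to confirm it round-trips to "user:password".
scheme, token = header.split(" ", 1)
decoded = base64.b64decode(token).decode("utf-8")
print(header)
print(decoded)
```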