Skip to content

Pytest and vcrpy to improve sh.hubstorage tests #32

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Oct 31, 2016
Merged

Pytest and vcrpy to improve sh.hubstorage tests #32

merged 18 commits into from
Oct 31, 2016

Conversation

vshlapakov
Copy link
Contributor

@vshlapakov vshlapakov commented Oct 18, 2016

Please, review.

@vshlapakov vshlapakov self-assigned this Oct 18, 2016
@codecov-io
Copy link

codecov-io commented Oct 18, 2016

Current coverage is 47.87% (diff: 100%)

Merging #32 into master will not change coverage

@@             master        #32   diff @@
==========================================
  Files            14         14          
  Lines          1178       1178          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits            564        564          
  Misses          614        614          
  Partials          0          0          

Powered by Codecov. Last update fe134f0...29ec8f3

@vshlapakov vshlapakov force-pushed the sc786 branch 2 times, most recently from b002127 to c058106 Compare October 19, 2016 12:46
@vshlapakov
Copy link
Contributor Author

vshlapakov commented Oct 19, 2016

@chekunkov All the hubstorage tests were converted to pytest, and works perfectly with/without VCR.py. Last step is to commit all the locally generated cassetes and enable tests for sh.hs back. But these cassetes take a lot of space, about 6.5MB, so after the commit it will be very hard to review the changes itself; so I propose to review the changes, fix and merge it, and add the cassetes & enable the tests in other PR.
PS I've converted the tests to pytest, but almost not refactored it or changed its content - just made sure that tests are passing. We could take a deeper look into it here, or in the other PR.

@vshlapakov
Copy link
Contributor Author

vshlapakov commented Oct 26, 2016

@chekunkov Last changes:

  • reworked existing fixtures, used vcr decorator to have per-session cassettes
  • found a culprit of high size of the cassettes (test_project.py:test_samples test), changed it a bit, now the cassette takes 200KB instead of 2+MB
  • gzip serializer for vcr cassettes
  • py.test option --update-cassettes to run tests with real services and update cassettes

The listed things helped to reduce cassettes size to ~800KB, wonder if it's ok. Most cassettes are already small enough, but probably we could win some space by rewriting tests with largest cassettes.

220K    cassetes/test_project/test_samples.gz
68K     cassetes/test_system/test_succeed_with_close_reason.gz
64K     cassetes/test_system/test_succeed_without_close_reason.gz
64K     cassetes/test_system/test_scraper_failure.gz
24K     cassetes/test_jobq/test_summary_countstart.gz
12K     cassetes/test_project/test_bulkdata.gz

PS Not all the test modules add new jobs, so initially I added clean-jobs fixture (with remove_all_job->yield->remove_all_jobs) only to certain modules with enabled auto-use (to avoid adding the fixture to each test function header). But it leaded to code duplication (4 same fixtures) and saved only about 100KB, so I dropped it in the last commit.

@vshlapakov vshlapakov changed the title [WIP] Pytest and vcrpy to improve sh.hubstorage tests Pytest and vcrpy to improve sh.hubstorage tests Oct 26, 2016
"--update-cassettes", action="store_true", default=False,
help="make tests with real services instead of vcr cassettes")

import shutil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

Imports should be grouped in the following order:

standard library imports
related third party imports
local application/library specific imports

You should put a blank line between each group of imports.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed this one, sorry!

with my_vcr.use_cassette(cassette_name):
remove_all_jobs(hsproject)
yield
remove_all_jobs(hsproject)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no need to remove all jobs after the test case, it's only needed to prepare environment for starting test case

request.function.__name__
)
with my_vcr.use_cassette(cassette_name):
remove_all_jobs(hsproject)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can reduce size of the cassettes even more by excluding this setup if we are using cassettes? It's needed only when you run tests against live servers and brings no value while using requests recorded in the cassettes. E.g. we can do this cleanup only if config.option.update_cassettes:

def pytest_addoption(parser):
parser.addoption(
"--update-cassettes", action="store_true", default=False,
help="make tests with real services instead of vcr cassettes")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding another option that doesn't drop existing cassettes, just optionally ignores them? E.g. to check tests with real services, but not override existing cassettes without the need

{'foo': 42},
{'_key': []},
{'_key': 'large_test', 'value': 'x' * 1024 ** 2},
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to use @pytest.mark.parametrize instead?

@chekunkov
Copy link
Contributor

Don't forget to add vcrpy to requirements-test.txt

# - before_record returning None means skipping all the requests
global my_vcr
my_vcr.record_mode = 'all'
my_vcr.before_record_request = lambda request: None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

collection = get_test_collection(hsproject)
clean_collection(collection)
yield collection
clean_collection(collection)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleaning after the test is not necessary

remove_all_jobs(hsproject)
yield
remove_all_jobs(hsproject)
unset_testbotgroup(hsproject)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_all_jobs and unset_testbotgroup are not necessary after the test

return str(hsproject.ids.spider(TEST_SPIDER_NAME, create=1))


@my_vcr.use_cassette()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doubt that you need vcr here... get_test_collection doesn't produce requests, cleanup makes sense only when it's running with real services, like in https://github.com/scrapinghub/python-scrapinghub/pull/32/files#diff-5a347e09fdac148d5d13fac9f567fc37R121

@pytest.fixture(autouse=True, scope='session')
def setup_session(hsclient, hsproject, hscollection):
set_testbotgroup(hsproject)
remove_all_jobs(hsproject)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do set_testbotgroup and remove_all_jobs make sense when running from vcrpy cassettes? if not - good candidate for vcr removal and condition like in https://github.com/scrapinghub/python-scrapinghub/pull/32/files#diff-5a347e09fdac148d5d13fac9f567fc37R121

@my_vcr.use_cassette()
@pytest.fixture(scope='session')
def hsspiderid(hsproject):
return str(hsproject.ids.spider(TEST_SPIDER_NAME, create=1))
Copy link
Contributor

@chekunkov chekunkov Oct 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, after some thinking - let's keep it, minor thing and we cannot guarantee the same spider id in dev environment

jobq = hsproject.jobq
for queuename in ('pending', 'running', 'finished'):
info = {'summary': [None]} # poor-guy do...while
while info['summary']:
Copy link
Contributor

@chekunkov chekunkov Oct 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, this is not Pythonic :P

while True:
    info = jobq.summary(queuename)
    if not info['summary']:
        break
    # ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sure! I just moved the function as-is and didn't notice it, thanks

help="test with real services rewriting existing vcr cassettes")
parser.addoption(
"--ignore-cassettes", action="store_true", default=False,
help="test with real services skipping existing vcr cassettes")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, this has to be added to the tests/conftest.py, otherwise it doesn't work with pytest tests

@vshlapakov vshlapakov merged commit 8185341 into master Oct 31, 2016
@vshlapakov vshlapakov deleted the sc786 branch October 31, 2016 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants