-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
TST: Refactor test_parsers.py #12964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Not sure if this is all the refactoring that needs to be done, but I think I got through a good portion of it. Further suggestions are welcome! |
9846686
to
0a7f5e6
Compare
0a7f5e6
to
ecf4cde
Compare
changing things like this:
|
it is really quite difficult to tell what you did and comment on it. I would split this up into multiple files ( e.g.
|
Ah, yes, unfortunately, I did not realize how much I had changed until I saw that GitHub could not show the diff at all. Also, your code block is blank if you were trying to demonstrate something... |
@gfyoung what I was saying (I edited), was the |
Hmmm...I'll see what I can do to break up the tests. It's not as straightforward as it is in |
Also, this PR is pretty much just reorganization and removal of duplicate tests. If you can view it, there are no more tests under |
Also, one more thing: why do we have |
@gfyoung my point is this is very hard to see. Let's break up first. Yes no realy need for a separate This IS straightforward to breakup if you just have different test classes rather than a single monolith. The idea is to test different options in different files. Perfect example is compressions which is alredy split out. But you can split out parse_date testing. line terminator testsing etc. Lots of ways to do this. |
@jreback : No, I perfectly understand (I realized that it might be a problem when I saw GitHub couldn't show it)! I was just trying to get some feedback first on how to break up |
ecf4cde
to
d439b47
Compare
Current coverage is 83.92%@@ master #12964 diff @@
==========================================
Files 136 136
Lines 49994 49918 -76
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
- Hits 42028 41889 -139
- Misses 7966 8029 +63
Partials 0 0
|
d439b47
to
b886475
Compare
@@ -0,0 +1,471 @@ | |||
""" | |||
Tests date parsing functionality for all of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/io/tests/parser/test_dates.py
should be the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then won't that file be run as a test suite? I don't want that (it needs to be imported as a test class).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well you need to fix it then if that's the way you want it. the correct way to do this is to split off the base class into common.py
or something which is then imported into each file. Look at how tests/indexes
was done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay, I see. I'll take a look into that then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tricky part (compared to tests/indexes
) is that there are really "three" base classes (CParserHighMemory
, CParserLowMemory
, PythonParser
). How does one organize that then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idea is then you can easily strip out related things and add as separeat files in a nice way. (and don't crows the namespace)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the main Base classes mostly empty. Then just sub-class them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH it seems easier to work in the way of making the individual test suites part of a common
package and sub-class them into each of the main parser tests as is done now. Otherwise, I fear there will be duplicate code IINM. For example, parse_date_tests.py
needs to be run for each of three parsers. But I can't just throw them all in as super-classes because they will clash. So I would then have to make three different versions, one for each parser, which I don't think is preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simpliest easiest is best. you can also make them mix-in's might be easier. but still keep in separate files (just don't name then test_*
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I'll try to first separate them out and then we can see what is best afterwards (still WIP with pulling out test suites from test_parsers
)
15f50f2
to
6a26224
Compare
@@ -0,0 +1,155 @@ | |||
# -*- coding: utf-8 -*- | |||
|
|||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think naming this compression.py
is better because then its not runable on its own (its only mixins to others). name test_*
if they are runnable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
bfbbed2
to
4f9c911
Compare
# -*- coding: utf-8 -*- | ||
|
||
""" | ||
Tests parsers ability to read and parse non-local files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so some of the files are not in the pandas/io/tests/parser/*
namespace e.g.
pandas/io/tests/test_network.py
as an example.
[jreback-~/pandas] git diff master --stat
pandas/io/tests/parser/__init__.py | 0
pandas/io/tests/parser/compression.py | 155 +++++
pandas/io/tests/parser/parse_dates.py | 468 +++++++++++++++
pandas/io/tests/parser/test_parsers.py | 3342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
pandas/io/tests/parser/test_textreader.py | 402 +++++++++++++
pandas/io/tests/test_cparser.py | 401 -------------
pandas/io/tests/test_network.py | 195 +++++++
pandas/io/tests/test_parsers.py | 5055 --------------------------------------------------------------------------------------------------------------------------------------------------------------
pandas/io/tests/test_read_fwf.py | 337 +++++++++++
pandas/io/tests/test_unsupported.py | 88 +++
pandas/tests/test_lib.py | 74 ++-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay, I'll move all of those (excluding test_lib
of course) into the parser
directory as well
I think this also needs an edit to e.g. add you can do a
|
4f9c911
to
edba8fa
Compare
assert_same_values_and_dtype(result, expected) | ||
|
||
|
||
def assert_same_values_and_dtype(res, exp): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put at the top
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
fc49f1a
to
818f711
Compare
something wrong. lots of tests not running
but if its explict it works
I think that
|
818f711
to
225637f
Compare
@jreback : I think you're right about that (at least that's what Google says). I changed the mode of |
225637f
to
20389b1
Compare
I think need to add |
|
|
Huh? What do you mean? That's the exact command |
so why didn't your build fail then? |
I haven't pushed |
oh, ok, then pls push. I think this is ready as is. (pls flake check) |
20389b1
to
c01b4bd
Compare
Refactored tests in test_parsers.py to increase coverage of the different types of parsers and remove nearly duplicate testing in some cases.
c01b4bd
to
3ed01a0
Compare
@jreback : Travis is giving the green light. Not sure what's going on with Codecov though. Is this good to merge? |
hmm, yeah the codecov that is looking odd |
So how to proceed? Can this be merged? Or is it a bug in |
yeah will take another look. if you have insite into codecov, then ok! |
Unfortunately, I do not :< . Let me know once you have more info! |
thank you sir! |
I had to do this: 879f217 to make the setup work in a clean env. Travis works from a clone so the installation is slightly different (it depends on setup). Side issue, this came up because it was segfaulting on |
Ah, I see. Thanks! |
Refactored tests in
test_parsers.py
to increase coverage of the different types of parsers and remove nearly duplicate testing in some cases.