Skip to content

CLN: Split up Boolean array tests #32780

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Mar 21, 2020
Merged

CLN: Split up Boolean array tests #32780

merged 14 commits into from
Mar 21, 2020

Conversation

dsaxton
Copy link
Member

@dsaxton dsaxton commented Mar 17, 2020

Apologies for the size of this PR, but it looks like the ExtensionArray tests sometimes have their own directories by type, and other times all tests for the type are in a single file (e.g., https://github.com/pandas-dev/pandas/tree/master/pandas/tests/arrays/categorical vs. https://github.com/pandas-dev/pandas/blob/master/pandas/tests/arrays/test_boolean.py). I think the "whole directory for type" structure makes it easier to find and add new tests, so I'm doing that here for the Boolean tests. There shouldn't be any changes other than moving things around.

@dsaxton dsaxton changed the title REF: Split up Boolean array tests CLN: Split up Boolean array tests Mar 17, 2020
return pd.array(make_data(), dtype=dtype)


class TestComparisonOps(BaseOpsUtil):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these be parametrized over EA/Series/DataFrame? (and eventually ExtensionIndex)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, I'm trying to avoid making any actual logic changes to tests here though since the PR is already quite big

from pandas.tests.extension.base import BaseOpsUtil


class TestLogicalOps(BaseOpsUtil):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side comment: do we need these classes in general? Seems like the location / name of the test should give enough context

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, if there is only a single class in a file, certainly not. But in this case, it might use things from the BaseOpsUtil class

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed with @jorisvandenbossche. Largely its a matter of personal preference; I tend to retain the class structure in cases where it helps further organize tests

@jorisvandenbossche
Copy link
Member

Personally, I don't find that this makes it easier to find the tests, though ... (with this I would need to look in multiple files of which I don't know the name)

Now, I assume that if we keep adding tests here, splitting it becomes unavoidable.

The question might then also be: where to put the line on what should be tested here?
For example, @jbrockmendel mentioned parametrizing the op tests to also test with series and dataframe. But then those tests maybe rather belong in one of the other test modules that are more specific, like tests/arithmetic, tests/reductions/, ...?

from pandas.tests.extension.base import BaseOpsUtil


class TestLogicalOps(BaseOpsUtil):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, if there is only a single class in a file, certainly not. But in this case, it might use things from the BaseOpsUtil class

import pandas as pd
import pandas._testing as tm


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also still combine the different ops in a single file?

@simonjayhawkins simonjayhawkins added the Testing pandas testing functions or related to the test suite label Mar 18, 2020
@jbrockmendel
Copy link
Member

mentioned parametrizing the op tests to also test with series and dataframe. But then those tests maybe rather belong in one of the other test modules that are more specific, like tests/arithmetic, tests/reductions/, ...?

Yes, if that was unclear: if the arithmetic/reduction tests can/should be parametrized over EA/Series/DataFrame, then they likely should be moved to tests.arithmetic or tests.reductions etc. (the situation is slightly complicated by the fact that those tests are also parametrized over Index which these can't be until ExtensionIndex is in operation)

@dsaxton
Copy link
Member Author

dsaxton commented Mar 20, 2020

Curious if @jreback has any preferences on the layout of these tests?

@jreback jreback added this to the 1.1 milestone Mar 21, 2020
@jreback jreback merged commit 447a595 into pandas-dev:master Mar 21, 2020
@jreback
Copy link
Contributor

jreback commented Mar 21, 2020

once a file gets > 500 lines, splitting into sub-modules is really needed. we have already split most other tests, so this is fine.

@dsaxton dsaxton deleted the ref-array-test branch March 21, 2020 21:14
SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
jbrockmendel pushed a commit to jbrockmendel/pandas that referenced this pull request Mar 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants