TST: Add test with large shape to check_below_min_count #46534

FactorizeD
Contributor

@FactorizeD FactorizeD commented Mar 27, 2022

The function in question didn't have any tests in the first place, so I thought I would add some and simply use the problematic tuple for the shape argument.

@jreback jreback added Testing pandas testing functions or related to the test suite Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Mar 28, 2022
@jreback
Contributor

jreback commented Mar 28, 2022

pls rebase, this is failing on windows

@FactorizeD FactorizeD force-pushed the bug-add-test-showing-np-prod-working-fine branch from 1e944f4 to 2f97a96 Compare March 28, 2022 17:44
@jreback jreback added this to the 1.5 milestone Mar 28, 2022
@jreback
Contributor

jreback commented Mar 28, 2022

on 32-bit (some other builds as well i think)

> =================================== FAILURES ===================================
_________ test_check_below_min_count__positive_min_count[1-False-None] _________

mask = None, min_count = 1, expected_result = False

    @pytest.mark.parametrize(
        "mask", [None, np.array([False, False, True]), np.array([True] + 2244366 * [False])]
    )
    @pytest.mark.parametrize("min_count, expected_result", [(1, False), (2812191852, True)])
    def test_check_below_min_count__positive_min_count(mask, min_count, expected_result):
        # shape taken to show that the issue with GH35227 is fixed
        shape = (2244367, 1253)
        result = nanops.check_below_min_count(shape, mask, min_count)
>       assert result == expected_result
E       assert True == False

@FactorizeD
Contributor Author

Right, I didn't test for 32-bit, my bad. I will work on it later this week. BTW, I am still new to the pandas testing process. Is there any way to check 32-bit builds other than configuring the continuous integration tests as described in https://pandas.pydata.org/docs/development/contributing_codebase.html#testing-with-continuous-integration?

@FactorizeD FactorizeD marked this pull request as draft March 29, 2022 23:03
@FactorizeD FactorizeD force-pushed the bug-add-test-showing-np-prod-working-fine branch from 2f97a96 to 36e9cc3 Compare April 3, 2022 22:03
@FactorizeD FactorizeD force-pushed the bug-add-test-showing-np-prod-working-fine branch from 36e9cc3 to 81da249 Compare April 5, 2022 21:30
@FactorizeD FactorizeD marked this pull request as ready for review April 5, 2022 22:40
@FactorizeD
Contributor Author

All should be good now. It took me a while to figure out that np.prod evaluates to a 32-bit dtype on all Windows machines (even 64-bit ones). In the latest version we have three tests: two running on all platforms and one running only on 64-bit non-Windows platforms (the one with the large shape that was initially causing the issue).
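
To make the overflow concrete, here is a minimal sketch in pure Python. It emulates 32-bit wraparound by hand rather than calling NumPy, and the helper names (`wrap_int32`, `prod_int32`, `below_min_count`) are hypothetical, chosen only for illustration. `below_min_count` assumes the check boils down to "is the number of non-null values smaller than `min_count`", which is a simplification and not the actual pandas code.

```python
def wrap_int32(value):
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    return (value + 2**31) % 2**32 - 2**31


def prod_int32(shape):
    """Multiply the dimensions, wrapping to 32 bits after every step."""
    result = 1
    for dim in shape:
        result = wrap_int32(result * dim)
    return result


def below_min_count(non_nulls, min_count):
    """Simplified stand-in for the check (an assumption, not pandas code)."""
    return min_count > 0 and non_nulls < min_count


shape = (2244367, 1253)          # shape used in the failing test
exact = shape[0] * shape[1]      # arbitrary-precision Python int
wrapped = prod_int32(shape)      # what a 32-bit product would hold

print(exact)                        # 2812191851, larger than 2**31 - 1
print(wrapped)                      # -1482775445
print(below_min_count(exact, 1))    # False, the expected result
print(below_min_count(wrapped, 1))  # True, matching the failure above
```

With the wrapped (negative) product, any positive `min_count` appears to be unmet, which is consistent with the `assert True == False` failure for `min_count = 1`.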

@jreback
Contributor

jreback commented Apr 5, 2022

> All should be good now - it took me a while to figure out that np.prod will evaluate the dtype to 32 bits on all Windows machines (even 64 bit ones). In the latest version we have three - 2 running on all platforms and 1 running on 64 bit non-windows ones (it is the one with large shape that was initially causing the issue)

hmm yeah you can specify np.prod(..., dtype='float64') but directly yeah on windows this would easily overflow.
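
A short sketch of the `dtype` suggestion, assuming NumPy is installed. Passing `dtype` explicitly makes the result independent of the platform's default integer, so the 32-bit case can be reproduced on any machine. The variable names are illustrative only.

```python
import numpy as np

shape = (2244367, 1253)  # shape used in the failing test

# Force a 32-bit accumulator: the product wraps around silently.
as_int32 = np.prod(shape, dtype=np.int32)

# Wider accumulators hold the true product.
as_int64 = np.prod(shape, dtype=np.int64)
as_float64 = np.prod(shape, dtype="float64")

print(as_int32)    # -1482775445
print(as_int64)    # 2812191851
print(as_float64)  # 2812191851.0
```

Note that `float64` represents integers exactly only up to 2**53, so for very large shapes an explicit `int64` may be the safer choice.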

thanks

@jreback jreback merged commit 3a02286 into pandas-dev:main Apr 5, 2022
@jreback
Contributor

jreback commented Apr 5, 2022

thanks @FactorizeD

@FactorizeD FactorizeD deleted the bug-add-test-showing-np-prod-working-fine branch April 7, 2022 19:20
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Development

Successfully merging this pull request may close these issues.

BUG: ValueError: Cannot convert non-finite values (NA or inf) to integer only when DF exceed certain size