Skip to content

Support all possible time units for Series.astype literals #562

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 4, 2023

Conversation

gandhis1
Copy link
Contributor

@gandhis1 gandhis1 commented Mar 3, 2023

@Dr-Irv FYI @ramvikrams @skatsuta

  • Tests added: Please use assert_type() to assert the type of any return value

@@ -1571,12 +1571,22 @@ def test_updated_astype() -> None:
np.complex128,
)

# Check a couple of selected time units, but not all, to avoid excessive test cases
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do this as follows:

if TYPE_CHECKING:
    from pandas._typing import TimedeltaDtypeArg

tdtypes: list["TimedeltaDtypeArg"] = [ "timedelta64[Y]",  # annotation ignored at runtime
    "timedelta64[M]", # etc ]
for tdt in tdtypes:
    check(
        assert_type(s.astype(tdt), TimedeltaSeries),
        pd.Series,
        Timedelta,
    )

The type checkers should then see that tdt is valid to pass, but, at least from my point of view, we can then see that all of those types are accepted at runtime when we do pytest.

Unfortunate part is that you have to copy all those values from _typing.pyi to test_series.py, but it still accomplishes the goal of testing them all without having to write a test for each one.

Can you try that out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I just pushed an update which does this, and avoids the duplication by using get_args and casting to a tuple[TimedeltaDtypeArg, ...]. Unless I'm mistaken, I believe that would be functionally equivalent to what you have above.

That said, I'm skeptical this is a comprehensive test. All the type assertion is checking is the type alias, not what is part of that type alias. Don't we need to check standalone string literals in addition to whether the runtime behavior matches the annotation?

If that's the case and we want to check everything, then I can just create all the individual test cases. It seems like the # of test cases isn't really a concern.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I'm skeptical this is a comprehensive test. All the type assertion is checking is the type alias, not what is part of that type alias. Don't we need to check standalone string literals in addition to whether the runtime behavior matches the annotation?

If that's the case and we want to check everything, then I can just create all the individual test cases. It seems like the # of test cases isn't really a concern.

The problem with using cast and get_args() is that you have everything inside of a if TYPE_CHECKING: block, so we are missing the benefit of making sure that the types we have in the list work at runtime. And we can't import TimedeltaDtypeArg at runtime, because it only exists in the stubs. In other words, the code you wrote never gets executed by pytest.

So we have to either do each test explicitly for each string, or just duplicate the set of possible strings in test_series.py

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have get_args() inside of a if TYPE_CHECKING block, so the code never gets executed. Problem is that the types are not available at runtime.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gandhis1

@Dr-Irv Dr-Irv merged commit df86724 into pandas-dev:main Mar 4, 2023
@gandhis1 gandhis1 deleted the astype branch March 4, 2023 18:07
twoertwein pushed a commit to twoertwein/pandas-stubs that referenced this pull request Apr 1, 2023
…v#562)

* Support all possible time units for Series.astype literals

* Update test to check all timedelta/timestamp dtype args in a loop

* List out all of the individual string literals in astype test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants