Fix issue with Series.loc[x] where x is a tuple specifying a specific index to get from a MultiIndex #350
def __getitem__(  # type: ignore[misc]
    self,
    idx: Scalar | tuple[Scalar, ...],
) -> S1: ...
but we want to distinguish having a tuple of just scalars, versus tuples that include slices or Index
Can you elaborate on which two cases this distinguishes?
It might be nice to add inline comments noting which types/overloads are for Index and which are for MultiIndex.
I found two ways to use a tuple for a normal index, but I don't think this is an expected/typical usage:
> type(pd.Series([1]).__getitem__((0, None)))
numpy.ndarray
> pd.Series([1], index=[("a", "b")]).__getitem__(("a", "b"))
1
If you have a MultiIndex, then you can specify a tuple of index values to get a specific element. You can use a tuple that contains a slice to get more than one value:
>>> s = pd.Series([1,2,3,4], index=pd.MultiIndex.from_product([[1,2], ["a","b"]], names=["nn","ab"]))
>>> s
nn  ab
1   a     1
    b     2
2   a     3
    b     4
dtype: int64
>>> s.loc[1,"b"]
2
>>> s.loc[pd.IndexSlice[1,:]]
ab
a    1
b    2
dtype: int64
The next commit adds comments and uses _IndexSliceTuple. Note also that s.loc[1,:] works in the example above.
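To make the two cases concrete, here is a small runnable sketch built on the same Series as the example above. The variable names (full, partial, via_slice) are only illustrative and are not taken from the PR or its tests; the point is that a complete tuple of scalars yields a single value, while a tuple containing a slice yields a Series.

```python
import pandas as pd

# Same data as the REPL example in this thread.
index = pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=["nn", "ab"])
s = pd.Series([1, 2, 3, 4], index=index)

# Complete tuple of scalars: selects exactly one element.
full = s.loc[1, "b"]

# Tuple that contains a slice: selects every row under nn == 1.
partial = s.loc[1, :]

# pd.IndexSlice[1, :] builds the same key, (1, slice(None)).
via_slice = s.loc[pd.IndexSlice[1, :]]

print(full)
print(partial.tolist(), via_slice.tolist())
```

These are the two return types the stub overloads need to tell apart: the element type for an all-scalar tuple, and a Series for anything that includes a slice.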
Thank you @Dr-Irv! Do you think it would be helpful to make Series (and DataFrame) generic with regard to their index to help with such overloads?
We could certainly try. We'd have to track the To some extent, part of the issue here with
I forgot that for a moment - yes, that would create many overlapping overloads (I believe also in pyright). As long as all type checkers pick the first matching overload, it can be fine to use overlapping overloads deliberately.
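As a rough illustration of deliberately overlapping overloads, the sketch below uses a made-up TinyLoc class (hypothetical, not the pandas-stubs indexer) backed by a plain dict. The all-scalar overload is listed first so that a checker which takes the first match resolves loc[1, "b"] to int. A checker such as mypy may flag the overlap between the two signatures, which is presumably why the stub in this PR carries a type: ignore[misc] comment.

```python
from __future__ import annotations

from typing import Union, overload

Scalar = Union[int, str]


class TinyLoc:
    """Hypothetical stand-in for a .loc indexer; handles only full slices (:)."""

    def __init__(self, data: dict[tuple[Scalar, ...], int]) -> None:
        self._data = data

    # Narrow overload first: a tuple made only of scalars returns one value.
    @overload
    def __getitem__(self, idx: tuple[Scalar, ...]) -> int: ...

    # Broader overload second: any tuple that may contain slices.
    @overload
    def __getitem__(
        self, idx: tuple[Scalar | slice, ...]
    ) -> dict[tuple[Scalar, ...], int]: ...

    def __getitem__(self, idx):
        if not any(isinstance(part, slice) for part in idx):
            return self._data[idx]
        # Treat every slice as "match anything" at that level.
        return {
            key: value
            for key, value in self._data.items()
            if all(isinstance(p, slice) or p == k for p, k in zip(idx, key))
        }


loc = TinyLoc({(1, "a"): 1, (1, "b"): 2, (2, "a"): 3, (2, "b"): 4})
print(loc[1, "b"])
print(loc[1, :])
```

At runtime the overload declarations have no effect; only the final implementation runs. The ordering matters only to the type checker.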
The following code failed:
This PR fixes that. The issue is that when a Series is backed by a MultiIndex, you should be able to get specific values by specifying a complete tuple. Added a test in test_series.py:test_multiindex_loc.