Fix issue with Series.loc[x] where x is a tuple specifying a specific index to get from a MultiIndex #350

Merged
merged 8 commits into from
Oct 2, 2022
21 changes: 11 additions & 10 deletions pandas-stubs/core/series.pyi
@@ -48,6 +48,7 @@ from pandas.core.indexes.timedeltas import TimedeltaIndex
from pandas.core.indexing import (
_AtIndexer,
_iAtIndexer,
_IndexSliceTuple,
)
from pandas.core.resample import Resampler
from pandas.core.strings import StringMethods
@@ -131,21 +132,21 @@ class _iLocIndexerSeries(_iLocIndexer, Generic[S1]):
) -> None: ...

class _LocIndexerSeries(_LocIndexer, Generic[S1]):
# ignore needed because of mypy. Overlapping, but we want to distinguish
# having a tuple of just scalars, versus tuples that include slices or Index
@overload
def __getitem__(
def __getitem__( # type: ignore[misc]
self,
idx: MaskType
| Index
| Sequence[float]
| list[str]
| slice
| tuple[str | float | slice | Index, ...],
) -> Series[S1]: ...
idx: Scalar | tuple[Scalar, ...],
# tuple case is for getting a specific element when using a MultiIndex
) -> S1: ...
Member

"but we want to distinguish having a tuple of just scalars, versus tuples that include slices or Index"

Can you elaborate on which two cases this distinguishes?

It might be nice to add inline comments noting which types/overloads apply to a plain Index and which to a MultiIndex.

I found two ways to use a tuple for a normal index, but I don't think this is an expected/typical usage:

> type(pd.Series([1]).__getitem__((0, None)))
numpy.ndarray
> pd.Series([1], index=[("a", "b")]).__getitem__(("a", "b"))
1

Collaborator Author

If you have a MultiIndex, then you can specify a tuple of index values to get a specific element. You can use a tuple that contains a slice to get more than one value:

>>> s = pd.Series([1,2,3,4], index=pd.MultiIndex.from_product([[1,2], ["a","b"]], names=["nn","ab"]))
>>> s
nn  ab
1   a     1
    b     2
2   a     3
    b     4
dtype: int64
>>> s.loc[1,"b"]
2
>>> s.loc[pd.IndexSlice[1,:]]
ab
a    1
b    2
dtype: int64
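
The two cases above can also be checked at runtime. This is a minimal sketch that rebuilds the same Series; the variable names `element`, `subset`, and `same_subset` are illustrative and not part of the PR.

```python
import pandas as pd

# Same Series as in the session above: a two-level MultiIndex (nn, ab).
s = pd.Series(
    [1, 2, 3, 4],
    index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=["nn", "ab"]),
)

# Tuple of scalars only: every level is pinned, so one element comes back.
element = s.loc[1, "b"]

# Tuple that contains a slice: more than one row can match, so a Series comes back.
subset = s.loc[pd.IndexSlice[1, :]]

# pd.IndexSlice[1, :] builds the plain tuple (1, slice(None)),
# so the shorter spelling selects the same rows.
same_subset = s.loc[1, :]
```

The first overload in the stub (returning `S1`) is meant to cover the `element` case, and the second (returning `Series[S1]`) the `subset` case.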

Collaborator Author

The next commit adds comments and uses _IndexSliceTuple. Note also that s.loc[1,:] works in the example above.
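
To illustrate why two overloads are needed, here is a toy sketch of the same pattern outside pandas. Everything in it (`ToyLoc`, `ToyScalar`, `ToySliceTuple`) is hypothetical and only mimics the shape of the stub; in particular, `ToySliceTuple` is a stand-in and not the actual definition of `_IndexSliceTuple`.

```python
from __future__ import annotations

from typing import Generic, TypeVar, Union, overload

T = TypeVar("T")

# Hypothetical aliases for illustration only; not the pandas-stubs definitions.
ToyScalar = Union[str, int, float]
ToySliceTuple = tuple[Union[ToyScalar, slice], ...]


class ToyLoc(Generic[T]):
    """Toy indexer over a dict keyed by tuples, mimicking MultiIndex lookups."""

    def __init__(self, data: dict[tuple[ToyScalar, ...], T]) -> None:
        self._data = data

    # The two tuple types overlap, which is why a type checker may need an
    # ignore comment on the first overload, as in the stub.
    @overload
    def __getitem__(self, idx: tuple[ToyScalar, ...]) -> T: ...  # type: ignore[misc]
    @overload
    def __getitem__(self, idx: ToySliceTuple) -> dict[tuple[ToyScalar, ...], T]: ...
    def __getitem__(self, idx):
        if any(isinstance(part, slice) for part in idx):
            # A slice is treated as "match anything at this level"
            # (this toy does not honour slice bounds).
            return {
                key: value
                for key, value in self._data.items()
                if all(isinstance(p, slice) or p == k for p, k in zip(idx, key))
            }
        # Only scalars: look up exactly one element.
        return self._data[idx]


loc = ToyLoc({(1, "a"): 1, (1, "b"): 2, (2, "a"): 3, (2, "b"): 4})
single = loc[1, "b"]   # tuple of scalars: one element
several = loc[1, :]    # tuple with a slice: a sub-mapping
```

Listing the all-scalar overload first means a checker that picks the first matching overload resolves `loc[1, "b"]` to the element type rather than the container type.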

@overload
def __getitem__(
self,
idx: str | float,
) -> S1: ...
idx: MaskType | Index | Sequence[float] | list[str] | slice | _IndexSliceTuple,
# _IndexSliceTuple is when having a tuple that includes a slice. Could just
# be s.loc[1, :], or s.loc[pd.IndexSlice[1, :]]
) -> Series[S1]: ...
@overload
def __setitem__(
self,
9 changes: 6 additions & 3 deletions tests/test_series.py
@@ -120,9 +120,12 @@ def test_types_loc_at() -> None:


def test_multiindex_loc() -> None:
s = pd.Series([1, 2, 3, 4], index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]]))
check(assert_type(s.loc[1, :], pd.Series), pd.Series)
check(assert_type(s.loc[pd.Index([1]), :], pd.Series), pd.Series)
s = pd.Series(
[1, 2, 3, 4], index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]]), dtype=int
)
check(assert_type(s.loc[1, :], "pd.Series[int]"), pd.Series, int)
check(assert_type(s.loc[pd.Index([1]), :], "pd.Series[int]"), pd.Series, int)
check(assert_type(s.loc[1, "a"], int), np.int_)


def test_types_boolean_indexing() -> None: