
Fix .loc to handle ambiguity if a single Scalar is first element of a tuple #576

Closed
@Dr-Irv

From discussion in #575

In

IndexType | MaskType | list[HashableT] | Hashable,

Replace Hashable with slice | _IndexSliceTuple | Callable

If you make the suggested change, that will cause the test test_frame.py:test_loc_slice() to fail, but I now realize that the expression used there is ambiguous:

>>> df1 = pd.DataFrame(
...         {"x": [1, 2, 3, 4]},
...         index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=["num", "let"]),
...     )
>>> df1.loc[1, :]
     x
let
a    1
b    2
>>> df2 = pd.DataFrame({"x": [1,2,3,4]}, index=[10, 20, 30, 40])
>>> df2.loc[10, :]
x    1
Name: 10, dtype: int64

So when the first element of the tuple is an integer, the result can be a DataFrame or a Series, depending on whether the underlying index is a MultiIndex or a regular Index.

The solution is then to add another overload in _LocIndexerFrame.__getitem__():

    @overload
    def __getitem__(self, idx: tuple[ScalarT, slice]) -> Series | DataFrame: ...
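To illustrate how an overload like this behaves, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in rather than the pandas-stubs code: `Series` and `DataFrame` are empty placeholder classes, `LocIndexerSketch` is a toy indexer, `int` takes the place of `ScalarT`, and the `is_multi` flag is an invented way to mimic the dependence on the index type.

```python
from __future__ import annotations

from typing import Union, overload


class Series:
    """Hypothetical stand-in for pd.Series."""


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame."""


class LocIndexerSketch:
    """Toy indexer; is_multi mimics whether the index is a MultiIndex."""

    def __init__(self, is_multi: bool) -> None:
        self.is_multi = is_multi

    # (scalar, slice): the static return type must cover both outcomes,
    # because the type checker cannot see which kind of index is in use.
    @overload
    def __getitem__(self, idx: tuple[int, slice]) -> Union[Series, DataFrame]: ...

    # A bare slice always selects rows, so the result is a DataFrame.
    @overload
    def __getitem__(self, idx: slice) -> DataFrame: ...

    def __getitem__(self, idx):
        if isinstance(idx, tuple):
            # With a MultiIndex the scalar picks one level and rows remain;
            # with a regular Index it picks a single row.
            return DataFrame() if self.is_multi else Series()
        return DataFrame()
```

A type checker would report `Series | DataFrame` for `LocIndexerSketch(True)[1, :]`, so callers have to narrow the result (for example with `isinstance`) before using DataFrame-only or Series-only methods.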

Then modify the test in test_loc_slice() to check that the type is Union[pd.Series, pd.DataFrame], and add another test corresponding to df2 above.
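The runtime half of the two checks could look roughly like the sketch below. The function name `check_loc_scalar_slice_sketch` is hypothetical, and the static assertion against Union[pd.Series, pd.DataFrame] that the real test would carry is left out, so this only confirms the runtime types shown in the session above.

```python
import pandas as pd


def check_loc_scalar_slice_sketch() -> None:
    # MultiIndex case: a scalar selects the first level, leaving a DataFrame.
    df1 = pd.DataFrame(
        {"x": [1, 2, 3, 4]},
        index=pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=["num", "let"]),
    )
    assert isinstance(df1.loc[1, :], pd.DataFrame)

    # Regular Index case: the same form of expression selects one row, a Series.
    df2 = pd.DataFrame({"x": [1, 2, 3, 4]}, index=[10, 20, 30, 40])
    assert isinstance(df2.loc[10, :], pd.Series)
```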

Originally posted by @Dr-Irv in #575 (comment)
