DEPR: Allowing grouping by an index label when there are duplicates #49434

Open
@rhshadrach

From #49373 (comment)

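import pandas as pd

# Both index levels share the name "f".
df = pd.DataFrame(
    {
        "a": [1, 1, 2],
        "b": [3, 4, 5],
    },
    index=pd.MultiIndex.from_tuples([(1, 1), (1, 2), (1, 3)], names=["f", "f"]),
)
# Grouping by "f" uses the first level with that name.
gb = df.groupby("f")
result = gb.sum()
print(result)
#    a   b
# f
# 1  4  12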

There are two levels with the label "f" here, so I think what the user wants in such a case is ambiguous. As such, we should deprecate the current behavior and eventually raise, instead of using the first level with the given label. In the similar case of grouping by columns, we currently raise.
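To make the ambiguity concrete, here is a minimal sketch (not part of the original report) that selects each of the two "f" levels explicitly by position with `level=`, and then shows the column-side behavior mentioned above. The names `by_first`, `by_second`, `dup_cols`, and `raised` are illustrative, and the `ValueError` in the column case is what current pandas versions are assumed to raise.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 1, 2], "b": [3, 4, 5]},
    index=pd.MultiIndex.from_tuples([(1, 1), (1, 2), (1, 3)], names=["f", "f"]),
)

# Selecting a level by position is unambiguous even when names repeat.
by_first = df.groupby(level=0).sum()   # groups on the values 1, 1, 1
by_second = df.groupby(level=1).sum()  # groups on the values 1, 2, 3

# Column-side analogue: grouping by a duplicated column label.
dup_cols = pd.DataFrame([[1, 2, 3], [1, 4, 5]], columns=["a", "a", "b"])
try:
    dup_cols.groupby("a").sum()
    raised = False
except ValueError:
    # Assumed behavior: pandas rejects the ambiguous column label.
    raised = True
```

The two `level=` results differ (one group versus three), which is why picking the first "f" without warning can hide a mistake.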

Labels: Deprecate (Functionality to remove in pandas), Groupby
