ENH: Respect observed=False in groupby for non-categoricals #55261

Open
@rhshadrach

With observed=True becoming the default in groupby in pandas 3.0, it occurs to me we now have the option to respect observed=False for non-categoricals. For a single grouping the output would be the same; it only makes a difference when there are multiple groupings. In the multiple groupings case, the resulting index of an aggregation would be the Cartesian product of the unique values of all the groupings. Transforms and filters are not impacted.
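To make the difference concrete, here is a small sketch of how things behave today, using a made-up frame (the column names `a`, `b`, `v` are just for illustration). With plain object columns, observed=False has no effect; with the same columns cast to categorical, the aggregation index is the full product of the categories.

```python
import pandas as pd

# Toy data: the combination ("y", "q") never occurs.
df = pd.DataFrame(
    {
        "a": ["x", "x", "y"],
        "b": ["p", "q", "p"],
        "v": [1, 2, 3],
    }
)

# Non-categorical groupings: observed=False is currently ignored,
# so only the combinations present in the data show up.
plain = df.groupby(["a", "b"], observed=False)["v"].sum()

# Categorical groupings: observed=False yields every combination of
# categories, including the unobserved ("y", "q").
as_cat = df.astype({"a": "category", "b": "category"})
full = as_cat.groupby(["a", "b"], observed=False)["v"].sum()

print(len(plain), len(full))
```

The proposal is, roughly, for `plain` to look like `full` when observed=False is passed explicitly.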

My motivation for this came up in #53521 and I've posted some further details there.

For the implementation, I'm guessing it would amount to wrapping the groupings in a Categorical and then going through the rest of the groupby code normally. There may be some difficulty in getting this to work when grouping by callables. If this doesn't get strong opposition off the bat, I can throw up a proof-of-concept PR just to see what this would look like.
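A user-level approximation of that idea, assuming the groupings are column names, might look like the following. The helper `sum_over_all_combinations` is hypothetical and not part of pandas; it only illustrates the "wrap in a Categorical" step and does not address callables or other kinds of groupers.

```python
import pandas as pd


def sum_over_all_combinations(df, keys, value):
    """Hypothetical helper: sum `value` over every combination of `keys`."""
    # Wrap each grouping column in a Categorical whose categories are
    # inferred from the values present in that column.
    wrapped = df.assign(**{key: pd.Categorical(df[key]) for key in keys})
    # From here on this is the ordinary categorical groupby path.
    return wrapped.groupby(keys, observed=False)[value].sum()


df = pd.DataFrame({"a": ["x", "x", "y"], "b": [1, 2, 1], "v": [10, 20, 30]})
result = sum_over_all_combinations(df, ["a", "b"], "v")
print(result)
```

Note that the resulting index levels are categorical rather than the original dtypes, and missing values in the keys are not turned into categories, so a real implementation would likely need to handle both.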
