Many groupby tests depend on a bug in DataFrameGroupBy.apply #28549

Closed
@christopherzimmerman

Discovered while fixing an issue in #28541

A large number of tests relating to DataFrameGroupBy.apply depend on a bug in apply where the grouped column is not removed from the returned DataFrame.

Here are some examples:

There are around 20 cases of this.


In general, the tests expect the result of something like this to include the "key" column, which should not be the case:

import pandas as pd

df = pd.DataFrame({"key": [1, 1, 1, 2, 2, 2, 3, 3, 3], "value": range(9)})
df.groupby('key').apply(lambda x: x.sum())

The current result keeps "key" as a regular column in addition to the index:

     key  value
key
1      3      3
2      6     12
3      9     21
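
For comparison, here is a minimal sketch of the result I would expect once the grouping column is excluded. It selects "value" explicitly before calling apply, so it should not depend on how a given pandas version treats the grouping column; the `expected` name is only for illustration and is not taken from the test suite.

```python
import pandas as pd

df = pd.DataFrame({"key": [1, 1, 1, 2, 2, 2, 3, 3, 3], "value": range(9)})

# Select only the non-grouping column, so each group passed to the
# lambda contains "value" but not "key".
expected = df.groupby("key")[["value"]].apply(lambda x: x.sum())

# "key" should appear only as the index, not as a column.
print(expected)
```

Tests that currently assert on a "key" column would need their expected frames updated to this shape.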

The underlying issue with apply is straightforward to fix: all that needs to be done is to change this return to use the _group_selection_context. However, the PR will have to include updates to many tests.
