Inconsistent result with cumsum columns #32462

Closed
@MathieuDutSik

Description

I have a problem with `cumsum` on a groupby with multiple columns:

>>> import numpy as np
>>> import pandas as pd
>>> df1 = pd.DataFrame({"A": [2, 1, np.nan, 1, 2, 2, 1], "B": [-8, 2, 3, 1, 5, 6, 7], "C": [3, 5, 6, 5, 4, 4, 3]})
>>> df1.groupby("A").cumsum()
    B   C
0  -8   3
1   2   5
2  -1   6
3   3  10
4  -3   7
5   3  11
6  10  13
>>> df1.groupby("A").cumsum()
    B   C
0  -8   3
1   2   5
2  -1  -1
3   3  10
4  -3   7
5   3  11
6  10  13

The cumsum appears to be computed only on the first column B, while column C is left unchanged. Worse, when I run the same call again, I get the result I would expect. I could accept either behavior, but getting different results from repeated identical calls seems wrong to me.
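
For anyone trying to reproduce this, here is a minimal sketch of a self-contained check. It runs the same call twice, compares the two results, and checks the rows with a non-NaN key against a per-group cumulative sum computed by hand. In the two outputs above, the only cell that differs is row 2, where `A` is NaN, so the check leaves that row out of the reference. The names `first`, `second`, `valid`, and `expected` are illustrative, not part of the original report, and what the script prints may depend on the pandas version.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "A": [2, 1, np.nan, 1, 2, 2, 1],
    "B": [-8, 2, 3, 1, 5, 6, 7],
    "C": [3, 5, 6, 5, 4, 4, 3],
})

# Run the same call twice; identical calls should give identical results.
first = df1.groupby("A").cumsum()
second = df1.groupby("A").cumsum()
repeatable = first.equals(second)

# Rows whose key is not NaN belong to a well-defined group.
valid = df1["A"].notna()

# Reference result: cumulative sum computed one group at a time.
parts = [
    df1.loc[df1["A"] == key, ["B", "C"]].cumsum()
    for key in df1["A"].dropna().unique()
]
expected = pd.concat(parts).sort_index()

# Compare as floats so the check does not depend on the result dtype.
rows_match = np.array_equal(
    first.loc[valid].to_numpy(dtype=float),
    expected.to_numpy(dtype=float),
)

print("repeated calls agree:", repeatable)
print("non-NaN-key rows match per-group cumsum:", rows_match)
```

If `repeated calls agree` prints `False`, the inconsistency described above is present in the installed version.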

Metadata

    Labels

    Bug, Groupby, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)
