BUG: pivot_table with multi-index columns only and margins=True gives wrong output or fails #31016

Closed
@charlesdong1991

Description

Discovered in #31013: when pivot_table is called with only `columns` set to a single column and margins=True, it fails with a KeyError.

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
...                    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
...                    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
...                    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
...                    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
>>> df.pivot_table(columns="A", margins=True, aggfunc=np.mean)
KeyError: 'bar'

However, when two columns are passed, the call does not raise, but it returns a result with the wrong shape:

>>> df.pivot_table(columns=["A", "B"], margins=True, aggfunc=np.mean)

   A    B  
D  bar  one    4.500000
        two    6.500000
   foo  one    1.666667
        two    3.000000
   All         3.666667
E  bar  one    7.000000
        two    9.000000
   foo  one    3.666667
        two    5.500000
   All         6.000000

This returns a Series, but A and B should form the column index, so the result should be a DataFrame.
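For reference, here is a rough sketch of the shape I would expect, built with groupby instead of pivot_table so it does not depend on the buggy code path. The `("All", "")` label for the margin column is only my assumption for illustration, not something pivot_table is confirmed to produce.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

# Per-group means, transposed so that (A, B) becomes a column MultiIndex
# and the value columns D and E become the rows.
expected = df.groupby(["A", "B"])[["D", "E"]].mean().T

# Overall means of D and E, added as a margin column.
# The ("All", "") label is an assumption made for this sketch.
expected[("All", "")] = df[["D", "E"]].mean()

print(type(expected).__name__)
print(expected.shape)
```

The values match the Series shown above (for example 4.5 for D under bar/one and 6.0 for E under All); only the layout differs, with D and E as rows and the A/B pairs plus the margin as columns.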

Metadata

Labels

Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
