Closed
Description
Discovered in #31013: when pivot_table is called with only columns defined, as a single column, and with margins set to True, the call fails with a KeyError.
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
...                    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
...                    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
...                    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
...                    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
>>> df.pivot_table(columns="A", margins=True, aggfunc=np.mean)
KeyError: 'bar'
However, when two columns are passed, the call succeeds but returns a result with the wrong shape:
>>> df.pivot_table(columns=["A", "B"], margins=True, aggfunc=np.mean)
   A    B
D  bar  one    4.500000
        two    6.500000
   foo  one    1.666667
        two    3.000000
   All         3.666667
E  bar  one    7.000000
        two    9.000000
   foo  one    3.666667
        two    5.500000
   All         6.000000
This returns a Series, but A and B should form the column index, so the result should be a DataFrame.
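To check how a given pandas installation behaves, the two calls can be wrapped in a small script that reports either the exception type or the type and shape of the result. This is only a sketch: the helper name describe is hypothetical, and passing values=["D", "E"] and aggfunc="mean" is an assumption made here so that the non-numeric column C does not interfere on newer pandas versions. It may not reproduce the original report exactly.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})


def describe(columns):
    """Hypothetical helper: summarize what pivot_table does for `columns`."""
    try:
        # values=["D", "E"] is an assumption added for this sketch; the
        # original report did not pass `values`.
        result = df.pivot_table(
            columns=columns, values=["D", "E"], margins=True, aggfunc="mean"
        )
    except Exception as exc:
        return f"raised {type(exc).__name__}: {exc}"
    return f"returned {type(result).__name__} with shape {result.shape}"


print("pandas", pd.__version__)
print('columns="A":       ', describe("A"))
print('columns=["A", "B"]:', describe(["A", "B"]))
```

If the behavior described above is still present, the first line should report a KeyError and the second a Series; a fixed version would presumably report a DataFrame in both cases.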