Closed
Description
When I use pd.Series.tolist as a reducer with a single-column groupby, it works.
When I do the same while grouping by several columns, which produces a MultiIndex, it does not.
It seems the "fast" cython groupby function, which has no quarrel with reducing into lists, throws an exception if the index is "complex", which seems to mean a MultiIndex. When that exception is caught, the groupby function falls back to the "pure_python" groupby, which throws a new exception if the reducing function returns a list.
Is this a bug or is there some logic to this which is not apparent to me?
Reproduce:
import pandas as pd
from numpy.random import randn

# Build an 11-row frame of random values, one row at a time.
s1 = pd.Series(randn(5), index=['a', 'b', 'c', 'd', 'e'])
df = pd.DataFrame([s1], columns=['a', 'b', 'c', 'd', 'e'])
for i in range(0, 10):
    s1 = pd.Series(randn(5), index=['a', 'b', 'c', 'd', 'e'])
    df2 = pd.DataFrame([s1], columns=['a', 'b', 'c', 'd', 'e'])
    df = pd.concat([df, df2])
df['gk'] = 'foo'
df['gk2'] = 'bar'
# This works.
df.groupby(['gk']).agg(pd.Series.tolist)
# This does not.
df.groupby(['gk', 'gk2']).agg(pd.Series.tolist)
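
A possible workaround, sketched here under assumptions rather than confirmed as the intended usage: collect each column per group with apply(list) on the column-wise groupby, then reassemble the results into a frame. The smaller frame and the names `grouped` and `result` are illustrative only, and behavior may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Small frame with two grouping keys, mirroring the report above.
df = pd.DataFrame(np.random.randn(4, 2), columns=['a', 'b'])
df['gk'] = 'foo'
df['gk2'] = 'bar'

# Assumption: going through apply(list) one column at a time is acceptable
# in place of agg(pd.Series.tolist) on the whole frame.
grouped = df.groupby(['gk', 'gk2'])
result = pd.DataFrame({col: grouped[col].apply(list) for col in ['a', 'b']})
print(result)
```

Each cell of `result` holds a plain Python list of that group's values, indexed by the (gk, gk2) pair.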