Description
When applying different functions to columns with a MultiIndex by passing a mapping to groupby.agg(), the name of the top column level gets lost.
I believe this is a bug, because the column labels themselves are unchanged (although the total number of columns might be smaller if not all columns are in the mapping).
In the example here I am using groupby.agg() even though, technically speaking, I want to do a transformation. However, groupby.agg() seems to be the only apply-like method that accepts a mapping of different functions per column. What would be the recommended way to do this?
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({
...: 'exp' : ['A']*6 + ['B']*6,
...: 'obj' : [1,1,1,2,2,2]*2,
...: 'rep' : [1,2,3] * 4,
...: 'var1' : range(12),
...: 'var2' : range(12,24),
...: 'var3' : range(24,36),
...: })
In [3]: df = df.set_index(['exp', 'obj', 'rep'])
In [4]: df = df.sort_index()
In [5]: df.columns.name = 'vars'
In [6]: print('before unstack: ', df.columns.names)
('before unstack: ', ['vars'])
In [7]: df = df.unstack('rep')
In [8]: print('after unstack: ', df.columns.names)
('after unstack: ', ['vars', 'rep'])
In [9]: funcs = {
...: 'var1' : lambda x: x - x.median(),
...: 'var2' : lambda y: y - y.mean(),
...: 'var3' : lambda y: y - y.sum(),
...: }
In [10]: df1 = df.groupby(level=0).agg(funcs)
In [11]: print('after groupby.agg: ', df1.columns.names)
('after groupby.agg: ', [None, 'rep'])
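For reference, here is a minimal sketch of one possible way to get a per-column transformation while keeping both column level names. It is not a confirmed recommendation: it assumes that looping over the mapping, calling groupby(...).transform() on each top-level column block, and gluing the pieces back together with pd.concat() is acceptable. The names `parts` and `result` are made up for illustration.

```python
import pandas as pd

# Same data as in the session above.
df = pd.DataFrame({
    'exp': ['A'] * 6 + ['B'] * 6,
    'obj': [1, 1, 1, 2, 2, 2] * 2,
    'rep': [1, 2, 3] * 4,
    'var1': range(12),
    'var2': range(12, 24),
    'var3': range(24, 36),
})
df = df.set_index(['exp', 'obj', 'rep']).sort_index()
df.columns.name = 'vars'
df = df.unstack('rep')

funcs = {
    'var1': lambda x: x - x.median(),
    'var2': lambda y: y - y.mean(),
    'var3': lambda y: y - y.sum(),
}

# Transform each top-level column block separately (hypothetical workaround).
# df[var] selects the sub-frame whose columns are the 'rep' level only.
parts = {
    var: df[var].groupby(level='exp').transform(func)
    for var, func in funcs.items()
}

# The dict keys become the outer column level, which is named explicitly here;
# the inner level name ('rep') is taken from the pieces.
result = pd.concat(parts, axis=1, names=['vars'])

print(result.columns.names)
```

Since transform() returns an object with the same index as its input, `result` has the same shape as `df`, which is what I would expect from a transformation rather than an aggregation.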