Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
##### Case MultiIndex levels = 2
cols = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two')])
ind = pd.date_range(start='2017-01-01', freq='15min', periods=8)
df = pd.DataFrame(np.random.randn(8,2), index=ind, columns=cols)
# Creates an agg dict to map the column to different functions
agg_dict = { col:(np.sum if col[1] == 'one' else np.mean) for col in df.columns }
resampled = df.resample('H').apply(lambda x: agg_dict[x.name](x))
try:
    assert isinstance(resampled.columns, pd.MultiIndex)
except AssertionError as e:
    e.args += ('Case nlevels={}'.format(df.columns.nlevels),)
    raise
##### Case MultiIndex levels = 3
cols = pd.MultiIndex.from_tuples([('A', 'i','one'), ('A', 'ii','two')])
ind = pd.date_range(start='2017-01-01', freq='15min', periods=8)
df = pd.DataFrame(np.random.randn(8,2), index=ind, columns=cols)
# Creates an agg dict to map the column to different functions
agg_dict = { col:(np.sum if col[2] == 'one' else np.mean) for col in df.columns }
resampled = df.resample('H').apply(lambda x: agg_dict[x.name](x))
try:
    assert isinstance(resampled.columns, pd.MultiIndex)
except AssertionError as e:
    e.args += ('Case nlevels={}'.format(df.columns.nlevels),)
    raise
##### Case MultiIndex levels = 4
cols = pd.MultiIndex.from_tuples([('A', 'a', '', 'one'), ('B', 'b', 'i', 'two')])
ind = pd.date_range(start='2017-01-01', freq='15min', periods=8)
df = pd.DataFrame(np.random.randn(8,2), index=ind, columns=cols)
agg_dict = { col:(np.sum if col[3] == 'one' else np.mean) for col in df.columns }
resampled = df.resample('H').apply(lambda x: agg_dict[x.name](x))
try:
    assert isinstance(resampled.columns, pd.MultiIndex)
except AssertionError as e:
    e.args += ('Case nlevels={}'.format(df.columns.nlevels),)
    raise
Problem description
With MultiIndex columns that have 2 or 3 levels, resample().apply() returns a frame with the same MultiIndex columns. With 4 levels, it returns single-level columns in which only the first level is kept.
In the code above, only the nlevels=4 case raises an AssertionError.
Expected Output
The code above should not raise: resampled.columns should still be a MultiIndex in all three cases, including nlevels=4.
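In the meantime, one possible workaround is to resample each column on its own and then reattach the original column index, so the result does not depend on how apply() rebuilds the columns. This is only a minimal sketch, not the pandas fix: resample_per_column and agg_by_col are hypothetical names introduced here, the aggregations are given as the strings 'sum' and 'mean' instead of np.sum and np.mean, '60min' stands in for 'H', and fixed data replaces np.random.randn so the result is reproducible.

```python
import numpy as np
import pandas as pd


def resample_per_column(df, rule, agg_by_col):
    """Resample every column with its own aggregation, keeping df.columns.

    Hypothetical helper: agg_by_col maps each column key to an
    aggregation name such as 'sum' or 'mean'.
    """
    pieces = [
        df.iloc[:, i].resample(rule).agg(agg_by_col[col])
        for i, col in enumerate(df.columns)
    ]
    out = pd.concat(pieces, axis=1)
    # Reattach the original columns explicitly so all levels survive.
    out.columns = df.columns
    return out


# Same 4-level layout as the failing case above, with deterministic data.
cols = pd.MultiIndex.from_tuples([('A', 'a', '', 'one'), ('B', 'b', 'i', 'two')])
ind = pd.date_range(start='2017-01-01', freq='15min', periods=8)
df = pd.DataFrame(np.arange(16, dtype=float).reshape(8, 2), index=ind, columns=cols)

agg_by_col = {col: ('sum' if col[3] == 'one' else 'mean') for col in df.columns}
resampled = resample_per_column(df, '60min', agg_by_col)

print(resampled.columns.nlevels)
```

Because the columns are assigned from df.columns at the end, the helper behaves the same for 2, 3, or 4 levels. It does one resample per column, so it may be slower than a single apply() on wide frames.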
Output of pd.show_versions()
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.3.0
Cython: 0.25.2
numpy: 1.12.0
scipy: 0.18.1
statsmodels: 0.8.0
xarray: 0.9.1
IPython: 5.3.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.2
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.9999999
httplib2: 0.10.3
apiclient: 1.6.2
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: 0.3.0.post