Description
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 11, (10, 3)), columns=['num1', 'num2', 'num3'])
df['category_tuple'] = [(0, 1), (0, 1), (0, 1), (2, 3), (2, 3), (2, 3), (2, 3), (0, 4), (5,), (6,)]
df['category_string'] = list('aaabbbbcde')
df = df[['category_tuple', 'category_string', 'num1', 'num2', 'num3']]
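# Grouping by the string column works:
# df.groupby('category_string').apply(lambda x: x.sort_values(by='num1'))
# Grouping by the tuple column raises the ValueError described below:
df.groupby('category_tuple').apply(lambda x: x.sort_values(by='num1'))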
Problem description
I get this error: ValueError: Names should be list-like for a MultiIndex
This behaviour changed somewhere between 0.19.0 and 0.19.1. This can be sidestepped by casting the tuples to a string before doing the groupby, but the behaviour is definitely unexpected.
Grouping by category_string works, but grouping by category_tuple does not. Other operations also break in the same way (e.g., describe()). I'd expect both to behave the same way, and not create a MultiIndex unless I request it.
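For reference, the workaround mentioned above (casting the tuples to strings before the groupby) could look roughly like the sketch below. The variable names key and result are only illustrative, and the numeric values are fixed rather than random so that the example is reproducible. This is a sketch of the workaround, not a fix for the underlying behaviour.

```python
import pandas as pd

# Fixed sample data in place of np.random.randint, for reproducibility.
df = pd.DataFrame({'num1': [0, 6, 0, 6, 1, 6, 3, 7, 4, 9]})
df['category_tuple'] = [(0, 1), (0, 1), (0, 1), (2, 3), (2, 3),
                        (2, 3), (2, 3), (0, 4), (5,), (6,)]

# Illustrative name: a string version of each tuple, e.g. '(0, 1)'.
# Grouping by strings avoids treating the tuples as MultiIndex keys.
key = df['category_tuple'].map(str)

# Group by the external string Series; the original tuple column is untouched.
result = df.groupby(key).apply(lambda x: x.sort_values(by='num1'))
print(result)
```

The group labels in the result are then strings such as '(0, 1)' rather than the original tuples, so they would need converting back if the tuples are required downstream.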
Expected Output
                 category_tuple category_string  num1  num2  num3
category_tuple
(0, 1)         0         (0, 1)               a     0     0    10
               2         (0, 1)               a     0     7     1
               1         (0, 1)               a     6     7     2
(2, 3)         4         (2, 3)               b     1     3     0
               6         (2, 3)               b     3     6     8
               3         (2, 3)               b     6    10     3
               5         (2, 3)               b     6     0     1
(0, 4)         7         (0, 4)               c     7     7     5
(5,)           8           (5,)               d     4     5     7
(6,)           9           (6,)               e     9     1     5
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
pandas: 0.22.0
pytest: None
pip: 10.0.0
setuptools: 28.8.0
Cython: None
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 6.3.1
sphinx: None
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None