In 0.15.2 (and I believe this remains the case), the docstring for DataFrame.unstack() states "The level involved will automatically get sorted." This is not necessarily the case when level is a list of levels.
In [40]: df = pd.DataFrame(np.arange(8).reshape((4, 2)),
                           index=pd.MultiIndex.from_tuples([(100, 'A', 'y'), (100, 'A', 'x'),
                                                            (100, 'B', 'x'), (200, 'B', 'y')],
                                                           names=['Nums', 'Upper', 'Lower']))
In [41]: df
Out[41]:
                  0  1
Nums Upper Lower
100  A     y      0  1
           x      2  3
     B     x      4  5
200  B     y      6  7
In [42]: df.unstack([1, 2])
Out[42]:
         0                   1
Upper    A         B         A         B
Lower    y    x    x    y    y    x    x    y
Nums
100      0    2    4  NaN    1    3    5  NaN
200    NaN  NaN  NaN    6  NaN  NaN  NaN    7
Note that the pivoted tuples are ordered as [(A, y), (A, x), (B, x), (B, y)], which is not sorted.
I would expect the result to be the same as the following:
In [43]: df.T.stack([1, 2]).T
Out[43]:
         0                   1
Upper    A         B         A         B
Lower    x    y    x    y    x    y    x    y
Nums
100      2    0    4  NaN    3    1    5  NaN
200    NaN  NaN  NaN    6  NaN  NaN  NaN    7
In fact, there seems to be a problem even when level is a list containing just a single level. Compare the following:
In [47]: df.unstack(2)
Out[47]:
              0         1
Lower         x    y    x    y
Nums Upper
100  A        2    0    3    1
     B        4  NaN    5  NaN
200  B      NaN    6  NaN    7
In [48]: df.unstack([2])
Out[48]:
              0         1
Lower         y    x    y    x
Nums Upper
100  A        0    2    1    3
     B      NaN    4  NaN    5
200  B        6  NaN    7  NaN
In [50]: df.T.stack(2).T
Out[50]:
              0         1
Lower         x    y    x    y
Nums Upper
100  A        2    0    3    1
     B        4  NaN    5  NaN
200  B      NaN    6  NaN    7
In [51]: df.T.stack([2]).T
Out[51]:
              0         1
Lower         x    y    x    y
Nums Upper
100  A        2    0    3    1
     B        4  NaN    5  NaN
200  B      NaN    6  NaN    7
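For anyone who wants to check this against their installed version, here is a minimal self-contained sketch of the comparison above. The helper columns_are_sorted is a hypothetical name introduced only for this illustration, not part of pandas. The final sort_index(axis=1) call is a possible workaround under the assumption that lexicographically sorted columns are what you want; it is not a fix for unstack() itself. No expected output is shown because the result may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Same frame as in the report: the 'Lower' level is deliberately unsorted.
index = pd.MultiIndex.from_tuples(
    [(100, "A", "y"), (100, "A", "x"), (100, "B", "x"), (200, "B", "y")],
    names=["Nums", "Upper", "Lower"],
)
df = pd.DataFrame(np.arange(8).reshape((4, 2)), index=index)


def columns_are_sorted(frame):
    """Hypothetical helper: True if the column tuples are in lexicographic order."""
    column_tuples = list(frame.columns)
    return column_tuples == sorted(column_tuples)


# The three calls compared in the report.
unstacked_scalar = df.unstack(2)
unstacked_list = df.unstack([2])
unstacked_multi = df.unstack([1, 2])

for label, frame in [
    ("unstack(2)", unstacked_scalar),
    ("unstack([2])", unstacked_list),
    ("unstack([1, 2])", unstacked_multi),
]:
    # Print rather than assert: the outcome depends on the pandas version.
    print(label, "columns sorted:", columns_are_sorted(frame))

# Possible workaround (an assumption, not a fix): sort the columns explicitly.
sorted_multi = unstacked_multi.sort_index(axis=1)
print(sorted_multi)
```

Sorting explicitly makes the column order independent of whether level was passed as a scalar or as a list, at the cost of one extra pass over the column index.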