Closed
Description
This appears to be a regression since 0.11 in handling duplicates in MultiIndex columns:
In [11]: df
Out[11]:
  h1 main  h3 sub  h5
0  a    A   1  A1   1
1  b    B   2  B1   2
2  c    B   3  A1   3
3  d    A   4  B2   4
4  e    A   5  B2   5
5  f    B   6  A2   6
In [12]: df2 = df.set_index(['main', 'sub']).T.sort_index(1)
In [13]: df2
Out[13]:
main   A           B
sub   A1  B2  B2  A1  A2  B1
h1     a   d   e   c   f   b
h3     1   4   5   3   6   2
h5     1   4   5   3   6   2
Selecting successively, one level at a time, gives an unexpected result for the non-duplicated label:
In [14]: df2['A']
Out[14]:
sub A1 B2 B2
h1   a  d  e
h3   1  4  5
h5   1  4  5
In [15]: df2['A']['B2']
Out[15]:
sub B2 B2
h1   d  e
h3   4  5
h5   4  5
In [16]: df2['A']['A1'] # this worked in 0.11
Out[16]:
   0
0  a
1  1
2  1
In [21]: df2['A']['A1'] # pandas 0.11
Out[21]:
h1    a
h3    1
h5    1
Name: A1, dtype: object
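For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the values printed above and uses the keyword form `sort_index(axis=1)` in place of the positional `sort_index(1)` from the session, on the assumption that newer pandas versions expect the keyword. The variable names `sub_a`, `dup`, and `single` are only illustrative.

```python
import pandas as pd

# Rebuild the frame shown in Out[11] from its printed values.
df = pd.DataFrame(
    {
        "h1": list("abcdef"),
        "main": list("ABBAAB"),
        "h3": [1, 2, 3, 4, 5, 6],
        "sub": ["A1", "B1", "A1", "B2", "B2", "A2"],
        "h5": [1, 2, 3, 4, 5, 6],
    }
)

# Same steps as In [12]; the column label ('A', 'B2') ends up duplicated.
df2 = df.set_index(["main", "sub"]).T.sort_index(axis=1)

sub_a = df2["A"]       # first level only: columns A1, B2, B2
dup = sub_a["B2"]      # duplicated label
single = sub_a["A1"]   # non-duplicated label, the case reported here

# Print the type and shape rather than assuming them, since the
# result for the non-duplicated label is what differs between versions.
print(type(dup).__name__, dup.shape)
print(type(single).__name__, single.shape)
```

If the second line reports a one-column DataFrame with a `0` column label rather than a Series named `A1`, the installed version shows the behavior described in this report.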
FWIW, I never liked how this can return a different type...
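On the return-type point, one possible workaround (a sketch, not a fix for the regression) is to select with a list of labels, which returns a DataFrame whether or not the label is duplicated. The frame below is built directly from the values in Out[13]; `unique_pick` and `dup_pick` are illustrative names.

```python
import pandas as pd

# Columns as printed in Out[13], including the duplicated ('A', 'B2').
cols = pd.MultiIndex.from_tuples(
    [("A", "A1"), ("A", "B2"), ("A", "B2"), ("B", "A1"), ("B", "A2"), ("B", "B1")],
    names=["main", "sub"],
)
df2 = pd.DataFrame(
    [list("adecfb"), [1, 4, 5, 3, 6, 2], [1, 4, 5, 3, 6, 2]],
    index=["h1", "h3", "h5"],
    columns=cols,
)

sub_a = df2["A"]

# A list key asks for "these columns", so the result stays two-dimensional.
unique_pick = sub_a[["A1"]]   # one matching column
dup_pick = sub_a[["B2"]]      # two matching columns

print(type(unique_pick).__name__, unique_pick.shape)
print(type(dup_pick).__name__, dup_pick.shape)
```

The trade-off is that the caller has to reduce the one-column result to a Series explicitly, for example with `unique_pick.iloc[:, 0]`, when a Series is what is wanted.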