Description
I find the docs around MultiIndex slicers quite confusing. They imply that the MultiIndex needs to be lexsorted and introduce the sortlevel() function, but then add a caveat that this doesn't actually ensure sortedness.
There are more details of my explorations and questions on StackOverflow:
http://stackoverflow.com/questions/31427466/ensuring-lexicographical-sort-in-pandas-multiindex
I'd like one of the following: a simple one-liner to ensure lexsortedness; more reassurance that the usual ways of creating a MultiIndex will lexsort the labels in each level; or more elaboration in the docs about exactly what goes wrong with indexing and slicing when the labels are not lexsorted.
Does my example also show a bug in is_lexsorted? I would expect sorted2.index.is_lexsorted() to be False here, as 'col1' is not lexsorted.
In [8]:
sorted2 = df3.sortlevel()
sorted2
Out[8]:
            data
col1 col2
b    1     three
     3       one
d    1       two
a    2      four
In [9]: sorted2.index.is_lexsorted()
Out[9]: True
In [10]: sorted2.index
Out[10]:
MultiIndex(levels=[[u'b', u'd', u'a'], [1, 2, 3]],
           labels=[[0, 0, 1, 2], [0, 2, 0, 1]],
           names=[u'col1', u'col2'])
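
To make the discrepancy concrete, here is a minimal pure-Python sketch using the levels and labels printed above. It rests on an assumption I have not confirmed against the docs: that is_lexsorted() compares the integer labels rather than the level values they point to. The helper is_sorted is a hypothetical name of mine, not a pandas function.

```python
# Levels and labels copied from the MultiIndex repr above.
levels = [["b", "d", "a"], [1, 2, 3]]
labels = [[0, 0, 1, 2], [0, 2, 0, 1]]


def is_sorted(rows):
    """Return True if the sequence of tuples is in non-decreasing order."""
    return all(a <= b for a, b in zip(rows, rows[1:]))


# One tuple of integer labels per row: (0, 0), (0, 2), (1, 0), (2, 1).
label_rows = list(zip(*labels))

# The same rows translated into the values the labels point to.
value_rows = [
    tuple(level[code] for level, code in zip(levels, row))
    for row in label_rows
]

# Assumption: this is the check is_lexsorted() performs.
labels_sorted = is_sorted(label_rows)
# This is the check I expected it to perform.
values_sorted = is_sorted(value_rows)

print(labels_sorted, values_sorted)  # prints: True False
```

If that assumption holds, the labels are in order while the values are not, because the 'col1' level itself is stored as b, d, a. That would explain why is_lexsorted() returns True here even though the displayed index is not sorted.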