Closed
Description
Partial selection with .xs() or .ix[] on a subset of the index levels returns a DataFrame with the fixed (selected) levels dropped, which is a very nice feature. However, when you select with a tuple that covers all levels, the returned DataFrame keeps every index level. It would be nice to have an option controlling whether the fixed levels are dropped or kept when sub-selecting, so that the result is consistent when you reach the lowest index level.
Perhaps sub-selection could accept a "drop_fixed_index" parameter.
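To make the request concrete, here is a minimal sketch of how such an option might behave. The helper `select_fixed` and its `drop_fixed_index` argument are hypothetical names used only for illustration; they are not part of pandas. The sketch assumes a pandas release that provides `DataFrame.droplevel`, and it resets to a plain positional index when every level is fixed, which is one possible choice among several.

```python
import numpy as np
import pandas as pd


def select_fixed(df, key, drop_fixed_index=True):
    """Hypothetical helper: select rows whose leading index levels equal `key`."""
    key = key if isinstance(key, tuple) else (key,)
    index = df.index
    # Build a boolean mask one level at a time so duplicate keys are handled.
    mask = np.ones(len(df), dtype=bool)
    for level, value in enumerate(key):
        mask &= index.get_level_values(level) == value
    result = df[mask]
    if not drop_fixed_index:
        # Keep every level, whether the key is partial or complete.
        return result
    if len(key) == index.nlevels:
        # Every level is fixed, so no level is left to index by.
        return result.reset_index(drop=True)
    # Drop only the leading levels that the key fixed.
    return result.droplevel(list(range(len(key))))


# Small non-unique, sorted MultiIndex used purely as example data.
index = pd.MultiIndex.from_tuples(
    [(0, 1, 2), (0, 1, 2), (0, 1, 3), (0, 2, 2), (1, 1, 2)],
    names=["A0", "A1", "A2"],
)
df = pd.DataFrame({"A3": [10, 11, 12, 13, 14]}, index=index)

partial = select_fixed(df, (0, 1))                          # only A2 remains
full_dropped = select_fixed(df, (0, 1, 2))                  # positional index
full_kept = select_fixed(df, (0, 1, 2), drop_fixed_index=False)  # all levels
```

With a flag like this the caller decides once whether fixed levels are dropped, and the same rule applies to partial and complete keys alike.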
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: print pd.__version__
0.11.0.dev-80945b6
In [4]: # Generate Test DataFrame
...: NUM_ROWS = 100000
...:
In [5]: NUM_COLS = 10
In [6]: col_names = ['A'+num for num in map(str,np.arange(NUM_COLS).tolist())]
In [7]: index_cols = col_names[:5]
In [8]: # Set DataFrame to have 5 level Hierarchical Index & Sort Index!
...: # The dtype does not matter try str or np.int64 same results.
...: df = pd.DataFrame(np.random.randint(5, size=(NUM_ROWS,NUM_COLS)), dtype=np.int64, columns=col_names)
...:
In [9]: df = df.set_index(index_cols).sort_index()
...
In [79]: df
Out[79]: <class 'pandas.core.frame.DataFrame'>
MultiIndex: 100000 entries, (0, 0, 0, 0, 0) to (4, 4, 4, 4, 4)
Data columns:
A5 100000 non-null values
A6 100000 non-null values
A7 100000 non-null values
A8 100000 non-null values
A9 100000 non-null values
dtypes: int64(5)
In [80]: df.ix[(0)]
Out[80]: <class 'pandas.core.frame.DataFrame'>
MultiIndex: 20011 entries, (0, 0, 0, 0) to (4, 4, 4, 4)
Data columns:
A5 20011 non-null values
A6 20011 non-null values
A7 20011 non-null values
A8 20011 non-null values
A9 20011 non-null values
dtypes: int64(5)
In [81]: df.ix[(0,1)]
Out[81]: <class 'pandas.core.frame.DataFrame'>
MultiIndex: 4007 entries, (0, 0, 0) to (4, 4, 4)
Data columns:
A5 4007 non-null values
A6 4007 non-null values
A7 4007 non-null values
A8 4007 non-null values
A9 4007 non-null values
dtypes: int64(5)
In [82]: df.ix[(0,1,2)]
Out[82]: <class 'pandas.core.frame.DataFrame'>
MultiIndex: 817 entries, (0, 0) to (4, 4)
Data columns:
A5 817 non-null values
A6 817 non-null values
A7 817 non-null values
A8 817 non-null values
A9 817 non-null values
dtypes: int64(5)
In [83]: df.ix[(0,1,2,3)]
Out[83]: <class 'pandas.core.frame.DataFrame'>
Int64Index: 162 entries, 0 to 4
Data columns:
A5 162 non-null values
A6 162 non-null values
A7 162 non-null values
A8 162 non-null values
A9 162 non-null values
dtypes: int64(5)
In [84]: df.ix[(0,1,2,3,4)]
Out[84]:
                A5  A6  A7  A8  A9
A0 A1 A2 A3 A4
0  1  2  3  4    1   2   2   4   2
            4    1   4   4   1   0
            4    2   1   4   1   3
            4    2   4   2   1   1
            4    1   1   2   1   4
            4    0   0   2   1   1
            4    2   0   0   3   1
            4    2   2   3   3   1
            4    3   0   3   4   1
            4    1   1   0   0   1
            4    2   1   0   2   4
            4    3   4   1   2   3
            4    0   4   3   1   0
            4    4   1   4   1   2
            4    1   3   4   3   3
            4    0   1   1   3   1
            4    2   2   2   0   3
            4    0   0   1   4   0
            4    1   0   1   4   2
            4    1   4   2   2   0
            4    4   2   0   3   1
            4    2   1   2   3   2
            4    4   2   0   1   4
            4    1   4   1   1   4
            4    1   0   1   2   4
            4    2   3   0   1   3
            4    2   1   3   3   3
            4    1   2   0   4   2
            4    3   0   4   4   0
            4    4   4   2   3   0
            4    0   0   1   3   2
            4    4   0   0   0   3
            4    2   0   3   4   2
            4    3   3   3   0   2
            4    4   2   2   0   1
            4    2   1   3   4   0
In [86]: df.index.lexsort_depth
Out[86]: 5
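The inconsistency in the session above can also be sidestepped in the opposite direction, by always keeping every level. The sketch below assumes a pandas release whose `DataFrame.xs` accepts the `drop_level` parameter (this may not hold for the development build used above), and it uses a small made-up frame rather than the 100000-row one.

```python
import pandas as pd

# Small non-unique, sorted MultiIndex standing in for the 5-level index above.
index = pd.MultiIndex.from_tuples(
    [(0, 1, 2), (0, 1, 2), (0, 1, 3), (0, 2, 2), (1, 1, 2)],
    names=["A0", "A1", "A2"],
)
df = pd.DataFrame({"A3": [10, 11, 12, 13, 14]}, index=index)

# Partial key: drop_level=False keeps the fixed levels A0 and A1.
partial = df.xs((0, 1), drop_level=False)

# Complete key on a non-unique index: all levels are kept as well.
full = df.xs((0, 1, 2), drop_level=False)

print(partial.index.names, len(partial))
print(full.index.names, len(full))
```

Passing `drop_level=False` everywhere gives results that always carry the full index, at the cost of redundant constant levels for partial keys.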