Description
(From #15425)
Currently, (non-`Multi`)`Index`es can be indexed with `Series` indexers. This also applies to `MultiIndex`es, in which case you select from the first level. Hence, it seems a natural consequence for `MultiIndex`es to be indexable with `DataFrame` indexers.
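As a minimal sketch of the current behaviour described above (the toy data and variable names are made up for illustration), a `Series` of labels can be passed to `.loc`, and on a `MultiIndex` the labels are matched against the first level:

```python
import pandas as pd

# Flat index: a Series of labels acts as a list-like indexer.
flat = pd.Series([10, 20, 30], index=["a", "b", "c"])
labels = pd.Series(["c", "a"])
flat_selection = flat.loc[labels]

# MultiIndex: the same kind of indexer selects on the first level only.
nested = pd.Series(
    range(4),
    index=pd.MultiIndex.from_product([["a", "b"], [1, 2]]),
)
nested_selection = nested.loc[pd.Series(["b"])]

print(flat_selection)
print(nested_selection)
```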
Moreover, once #15434 is fixed, we will have a bi-dimensional object (`MultiIndex`) which can be indexed with `np.array`s... but only one-dimensional ones! This is also strange.
The feature per se is certainly useful. As a simple real-world example, I am currently working with a `subjects` `DataFrame` to which I must attribute two columns from `design`, another `DataFrame`, depending on the `group` and `time` columns of `subjects`, which are also levels of the `MultiIndex` of `design`. I would like to just do
subjects[design.columns] = design.loc[subjects[["group", "time"]]]
Now, I know this could be solved by `.join`ing the two `DataFrame`s... but this is conceptually more complicated (at the moment I don't even know whether I can join one `DataFrame` on columns and the other on index levels... but this is OT), to the point that I'm rather doing:
to_mi = lambda df : df.set_index(list(df.columns)).index
subjects[design.columns] = design.loc[to_mi(subjects[["group", "time"]])]
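For concreteness, here is a self-contained sketch of that workaround on made-up data (the column names `dose` and `sessions` and all values are hypothetical). Two things are assumptions of this sketch rather than part of the snippet above: the looked-up values are assigned positionally to sidestep index alignment, and the `.join(..., on=...)` variant is shown only as one possible way to join columns of one frame against index levels of the other.

```python
import pandas as pd

# Hypothetical design table, indexed by (group, time).
design = pd.DataFrame(
    {"dose": [0, 5, 10, 20], "sessions": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_tuples(
        [("ctrl", 1), ("ctrl", 2), ("treat", 1), ("treat", 2)],
        names=["group", "time"],
    ),
)

# Hypothetical subjects table carrying the keys as ordinary columns.
subjects = pd.DataFrame(
    {
        "name": ["s1", "s2", "s3"],
        "group": ["treat", "ctrl", "treat"],
        "time": [2, 1, 1],
    }
)


def to_mi(df):
    """Turn all columns of df into a MultiIndex (same idea as the lambda above)."""
    return df.set_index(list(df.columns)).index


# Look up one design row per subject, in the subjects' order.
lookup = design.loc[to_mi(subjects[["group", "time"]])]

# Assign by position (assumption of this sketch) to avoid index alignment.
with_design = subjects.copy()
for col in design.columns:
    with_design[col] = lookup[col].to_numpy()

# Alternative: join the key columns against the MultiIndex of design.
joined = subjects.join(design, on=["group", "time"])

print(with_design)
print(joined)
```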
@jorisvandenbossche suggests this feature would add complexity to indexing, "eg, should the column names align on the level names?". I'm personally fine with both answers:
- Yes: then we just use something like `to_mi` above (transforming a `DataFrame` into a `MultiIndex`, and then using it to actually index)
- No: then it's really really simple (we just transform the `DataFrame` into tuples - I had actually already done this in Mi indexing #15425 before rolling back)

"Yes" is probably the cleanest answer (possibly together with allowing indexing with bi-dimensional `np.array`s, to obtain the equivalent of the "No" answer). In any case, once we decide, I can take care of this.
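To make the "Yes" and "No" answers to the alignment question concrete, here is a rough sketch of what each could mean. The helpers `loc_aligned` and `loc_positional` are hypothetical names invented for this illustration, not existing pandas API, and the data is made up.

```python
import pandas as pd

# Hypothetical design table, indexed by (group, time).
design = pd.DataFrame(
    {"dose": [0, 5, 10, 20]},
    index=pd.MultiIndex.from_tuples(
        [("ctrl", 1), ("ctrl", 2), ("treat", 1), ("treat", 2)],
        names=["group", "time"],
    ),
)

keys = pd.DataFrame({"group": ["treat", "ctrl", "treat"], "time": [2, 1, 1]})


def loc_aligned(df, indexer):
    """'Yes': match the indexer's column names to the index level names."""
    ordered = indexer[list(df.index.names)]
    return df.loc[pd.MultiIndex.from_frame(ordered)]


def loc_positional(df, indexer):
    """'No': ignore column names and use each row as a plain tuple."""
    tuples = list(indexer.itertuples(index=False, name=None))
    return df.loc[tuples]


# With columns in level order, both interpretations agree.
aligned = loc_aligned(design, keys)
positional = loc_positional(design, keys)

# With the columns swapped, only the aligned version still finds the rows;
# the positional version would look up (time, group) pairs, which do not
# exist in the index of design, so it is not called here.
swapped = keys[["time", "group"]]
aligned_swapped = loc_aligned(design, swapped)

print(aligned)
print(aligned_swapped)
```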