Adding support for indexing a MultiIndex with a DataFrame and/or bi-dimensional np.array #15438

(From #15425)

Currently, (non-``Multi``)``Index``es can be indexed with ``Series`` indexers. This also applies to ``MultiIndex``es, in which case the selection is made on the first level. Hence, it seems a natural extension for ``MultiIndex``es to be indexable with ``DataFrame`` indexers.

Moreover, once #15434 is fixed, we will have a bi-dimensional object (``MultiIndex``) which can be indexed with ``np.array``s... but only one-dimensional ones! This is also strange.

The feature per se is certainly useful. As a simple real-world example, I am currently working with a ``subjects`` ``DataFrame`` to which I must add two columns from ``design``, another ``DataFrame``, based on the ``group`` and ``time`` columns of ``subjects``, which are also the levels of the ``MultiIndex`` of ``design``. I would like to just do

``` python
subjects[design.columns] = design.loc[subjects[["group", "time"]]]
```

Now, I know this could be solved by ``.join``ing the two ``DataFrame``s... but this is conceptually more complicated (I currently don't even know whether I can join one ``DataFrame`` on columns and the other on index levels... but this is OT), to the point that I'm instead doing:

```python
to_mi = lambda df: df.set_index(list(df.columns)).index
subjects[design.columns] = design.loc[to_mi(subjects[["group", "time"]])]
```
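For anyone who wants to try the workaround, here is a minimal self-contained sketch. The toy data (the ``dose`` and ``placebo`` columns and all values) is hypothetical; only ``subjects``, ``design``, ``group``, ``time`` and ``to_mi`` come from the description above. The looked-up values are assigned column by column as plain arrays, so that the assignment does not try to align the ``MultiIndex`` of the result with the index of ``subjects``. For comparison, the last line uses ``DataFrame.join`` with ``on=``, which, as far as I can tell, does accept columns on the left and index levels on the right.

```python
import pandas as pd

# Hypothetical toy data; only the frame and key names come from the issue.
design = pd.DataFrame(
    {"dose": [0.0, 0.0, 1.5, 3.0], "placebo": [True, True, False, False]},
    index=pd.MultiIndex.from_tuples(
        [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["group", "time"]
    ),
)
subjects = pd.DataFrame(
    {"name": ["s1", "s2", "s3"], "group": ["b", "a", "b"], "time": [2, 1, 1]}
)


def to_mi(df):
    """Turn all columns of ``df`` into a MultiIndex (one level per column)."""
    return df.set_index(list(df.columns)).index


# Look up one row of ``design`` per row of ``subjects``.
looked_up = design.loc[to_mi(subjects[["group", "time"]])]

# Copy the values over positionally, column by column.
result = subjects.copy()
for col in design.columns:
    result[col] = looked_up[col].to_numpy()

# Alternative for comparison: join columns of the left frame
# against the index levels of the right frame.
joined = subjects.join(design, on=["group", "time"])
```

Both ``result`` and ``joined`` should carry the same ``dose`` and ``placebo`` values for each subject; the proposal is about making the first form work without the ``to_mi`` detour.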

@jorisvandenbossche [suggests](https://github.com/pandas-dev/pandas/pull/15425#issuecomment-280500783) this feature would add complexity to indexing, _"eg, should the column names align on the level names?"_. I'm personally fine with both answers:

- **Yes**: then we just use something like ``to_mi`` above (transforming the ``DataFrame`` into a ``MultiIndex``, and then using that to actually index)
- **No**: then it's really simple (we just transform the ``DataFrame`` into tuples - I had actually already done this in #15425 before rolling back)

"**Yes**" is probably the cleanest answer (possibly together with allowing indexing with bi-dimensional ``np.array``s, to obtain the equivalent of the "**No**" answer). In any case, once we decide, I can take care of this.
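To make the two answers concrete, here is a rough sketch of what each could mean, written as ordinary helper functions on top of the existing API. The names ``index_by_name`` and ``index_by_position`` are hypothetical and only for illustration, as is the toy data; this is not the code from #15425, and the real change would live inside the indexing machinery.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only.
design = pd.DataFrame(
    {"dose": [0.0, 0.0, 1.5, 3.0]},
    index=pd.MultiIndex.from_tuples(
        [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["group", "time"]
    ),
)


def index_by_name(target, indexer):
    """'Yes': match the indexer's columns to the level names of target."""
    keys = indexer[list(target.index.names)]  # reorder columns to level order
    return target.loc[pd.MultiIndex.from_frame(keys)]


def index_by_position(target, indexer):
    """'No': treat each row as a positional tuple; column names are ignored."""
    rows = np.asarray(indexer, dtype=object)
    return target.loc[[tuple(row) for row in rows]]


# Columns deliberately given in the "wrong" order.
swapped = pd.DataFrame({"time": [2, 1], "group": ["b", "a"]})

by_name = index_by_name(design, swapped)  # order of columns is irrelevant
by_pos = index_by_position(design, swapped[["group", "time"]])  # order matters
by_arr = index_by_position(design, np.array([["b", 2], ["a", 1]], dtype=object))
```

With "**Yes**", ``swapped`` can be passed as is because the columns are matched to the levels by name. With "**No**", the caller is responsible for putting the columns in level order, which is also the only option for a bare two-dimensional ``np.array``.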
