Open
Description
Feature Type
-
Adding new functionality to pandas
-
Changing existing functionality in pandas
-
Removing existing functionality in pandas
Problem Description
I think there are some related issues on this topic, but none that clearly describes the particular use case I have in mind.
Consider data frame P
with multi-index columns a, b, c
and data frame Q
with multi-index columns a, b
.
Given some subset of rows from Q
, I want to be able to use its .index
to subset rows from P
, without a lot of intermediate processing or boilerplate code.
In relational terms, this is a semi-join operation between P and the subset of rows from Q.
Feature Description
Setup:
P = pd.DataFrame(..., index=pd.MultiIndex(..., names=["a", "b", "c"]))
Q = pd.DataFrame(..., index=pd.MultiIndex(..., names=["a", "b"]))
assert not P.index.duplicated.any()
assert not Q.index.duplicated.any()
Q_sub = Q.head()
Desired functionality
Any of these, or something similarly convenient:
P_sub = P.loc[Q_sub.index]
P_sub = P.xs(Q_sub.index)
P_sub = P.xs(Q_sub.index, level=Q_sub.index.names)
P_sub = P.join_semi(Q_sub, how="semi")
Alternative Solutions
Something like this, I think?
P_sub = P.loc[pd.IndexSlice[Q_sub.index.get_level_values("a"), Q_sub.index.get_level_values("b"), :]]
In the case when Q_sub
is exactly 1 row, we can also use .xs
:
P_sub = P.xs(Q_sub.index[0], level=Q_sub.index.names)