Commit f17b96c

Update indexing.rst
Add pd_lookup_het() and pd_lookup_hom()
1 parent 6a2da7a commit f17b96c

1 file changed (+23, -7 lines changed)

doc/source/user_guide/indexing.rst

Lines changed: 23 additions & 7 deletions
@@ -1461,16 +1461,32 @@ Looking up values by index/column labels
 
 Sometimes you want to extract a set of values given a sequence of row labels
 and column labels, this can be achieved by ``pandas.factorize`` and NumPy indexing.
-For instance:
+
+For heterogeneous column types, we subset columns to avoid unnecessary numpy conversions:
+
+.. ipython:: python
+
+    def pd_lookup_het(df, row_labels, col_labels):
+        rows = df.index.get_indexer(row_labels)
+        cols = df.columns.get_indexer(col_labels)
+        sub = df.take(np.unique(cols), axis=1)
+        sub = sub.take(np.unique(rows), axis=0)
+        rows = sub.index.get_indexer(row_labels)
+        values = sub.melt()["value"]
+        cols = sub.columns.get_indexer(col_labels)
+        flat_index = rows + cols * len(sub)
+        result = values[flat_index]
+        return result
+
+For homogeneous column types, it is fastest to skip column subsetting and go directly to numpy:
 
 .. ipython:: python
 
-    df = pd.DataFrame({'col': ["A", "A", "B", "B"],
-                       'A': [80, 23, np.nan, 22],
-                       'B': [80, 55, 76, 67]})
-    df
-    idx, cols = pd.factorize(df['col'])
-    df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
+    def pd_lookup_hom(df, row_labels, col_labels):
+        rows = df.index.get_indexer(row_labels)
+        cols = df.columns.get_indexer(col_labels)
+        result = df.to_numpy()[rows, cols]
+        return result
 
 Formerly this could be achieved with the dedicated ``DataFrame.lookup`` method
 which was deprecated in version 1.2.0 and removed in version 2.0.0.
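
The commit adds the two helpers but no call to either of them. The sketch below shows one way they might be used. It copies both function bodies from the added lines and applies them to the sample frame from the removed example. The calling pattern (passing ``df.index`` as the row labels and ``df["col"]`` as the column labels) is an assumption for illustration, not something the commit states.

```python
import numpy as np
import pandas as pd


def pd_lookup_het(df, row_labels, col_labels):
    # Body copied from the lines added in this commit.
    rows = df.index.get_indexer(row_labels)
    cols = df.columns.get_indexer(col_labels)
    # Keep only the columns and rows that are actually looked up.
    sub = df.take(np.unique(cols), axis=1)
    sub = sub.take(np.unique(rows), axis=0)
    rows = sub.index.get_indexer(row_labels)
    # melt() stacks the columns end to end into a single "value" column.
    values = sub.melt()["value"]
    cols = sub.columns.get_indexer(col_labels)
    flat_index = rows + cols * len(sub)
    result = values[flat_index]
    return result


def pd_lookup_hom(df, row_labels, col_labels):
    # Body copied from the lines added in this commit.
    rows = df.index.get_indexer(row_labels)
    cols = df.columns.get_indexer(col_labels)
    result = df.to_numpy()[rows, cols]
    return result


# Sample frame taken from the example that this commit removes.
df = pd.DataFrame({"col": ["A", "A", "B", "B"],
                   "A": [80, 23, np.nan, 22],
                   "B": [80, 55, 76, 67]})

# Assumed calling pattern: for each row, pick the column named in df["col"].
het = pd_lookup_het(df, df.index, df["col"])
hom = pd_lookup_hom(df, df.index, df["col"])

# The removed factorize-based expression, kept here as a cross-check.
idx, cols = pd.factorize(df["col"])
old = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]

print(het.to_numpy())
print(hom)
print(old)
```

On this frame all three approaches select the values 80, 23, 76 and 67. ``pd_lookup_het`` returns a ``Series`` indexed by the flat positions, while ``pd_lookup_hom`` returns an object-dtype array here because the frame also holds the string column ``col``.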
