Description
Pandas version checks
- I have checked that the issue still exists on the latest versions of the docs on main here
Location of the documentation
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
Documentation problem
The .loc API reference contains examples for single conditional lookups but does not include examples of multi-conditional lookups.
When multiple conditions are used in .loc, each condition must be wrapped in parentheses ( ) and the conditions must be combined with a single & or | rather than the and / or keywords. This differs from typical Python conditional syntax and can lead to confusion.
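To illustrate the point above, here is a minimal sketch. It assumes the small example frame from the .loc reference page (cobra, viper, sidewinder); the variable names both, either, and and_error are only for illustration.

```python
import pandas as pd

# Assumed to match the frame used in the DataFrame.loc reference examples.
df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)

# Each comparison is parenthesized because & and | bind more tightly
# than > and < in Python's operator precedence.
both = df.loc[(df["max_speed"] > 1) & (df["shield"] < 8)]
either = df.loc[(df["max_speed"] > 6) | (df["shield"] < 3)]

# The `and` keyword asks for the truth value of a whole Series,
# which pandas refuses with a ValueError.
try:
    df.loc[(df["max_speed"] > 1) and (df["shield"] < 8)]
    and_error = None
except ValueError as exc:
    and_error = exc
```

Here both selects only the viper row, either selects cobra and sidewinder, and and_error holds the ValueError raised by the and version.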
Suggested fix for documentation
Add an example of a multi-conditional lookup after the single conditional example and before the "Callable that returns a boolean Series" example.
>>> df.loc[(df['max_speed'] > 1) & (df['shield'] < 8)]
       max_speed  shield
viper          4       5
Also, add a note that restructuring a DataFrame to use a MultiIndex for lookups may perform better than using .loc with three or more conditionals. Prominently link to the MultiIndex user guide in this note and mention that examples of .loc on MultiIndex objects can be found further down the page.
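As a rough sketch of what such a note could contrast, the snippet below performs the same three-condition equality lookup with boolean masks and with a MultiIndex. The data and the column names (region, kind, year, value) are hypothetical, and any performance difference is an assumption that depends on the data and would need to be measured.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west"],
        "kind": ["a", "b", "a", "b"],
        "year": [2020, 2021, 2020, 2021],
        "value": [10, 20, 30, 40],
    }
)

# Three equality conditions combined with boolean masks.
masked = df.loc[
    (df["region"] == "west") & (df["kind"] == "a") & (df["year"] == 2020)
]

# The same lookup after moving the three columns into a sorted MultiIndex.
indexed = df.set_index(["region", "kind", "year"]).sort_index()
by_key = indexed.loc[("west", "a", 2020)]
```

The MultiIndex form only covers equality-style lookups on the indexed levels; range conditions such as > or < still need boolean masks or slicing.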
Rationale
.loc is a commonly used attribute, and its reference page is often the first point of entry into the documentation from a Google search, Stack Overflow article, or other outside source. A relatively new user may miss the user guides because of this entry point (I personally did for months while learning pandas). Providing an example of .loc usage with multiple conditionals and a prominent link to the related user guides could save such users hours of frustration.
Note
I have already written a draft of this addition to the documentation. I plan to assign this issue to myself and make the contribution, provided it is approved.