Improved documentation for DataFrame.join #12193
Changes from 1 commit
@@ -4318,18 +4318,20 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
     Series is passed, its name attribute must be set, and that will be
     used as the column name in the resulting joined DataFrame
 on : column name, tuple/list of column names, or array-like
-    Column(s) to use for joining, otherwise join on index. If multiples
+    Column(s) in the caller to join on the index in other,
+    otherwise joins index-on-index. If multiples
     columns given, the passed DataFrame must have a MultiIndex. Can
     pass an array as the join key if not already contained in the
     calling DataFrame. Like an Excel VLOOKUP operation
 how : {'left', 'right', 'outer', 'inner'}
-    How to handle indexes of the two objects. Default: 'left'
-    for joining on index, None otherwise
-
-    * left: use calling frame's index
-    * right: use input frame's index
-    * outer: form union of indexes
-    * inner: use intersection of indexes
+    How to handle the operation of the two objects. Default: 'left'
+
+    * left: use calling frame's index (or column if on is specified)
+    * right: use other frame's index
+    * outer: form union of calling frame's index (or column if on is
+      specified) with other frame's index
+    * inner: form intersection of calling frame's index (or column if
+      on is specified) with other frame's index
 lsuffix : string
     Suffix to use from left frame's overlapping columns
 rsuffix : string
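As a rough illustration of the revised `how` description, the sketch below reuses the `caller` and `other` frames from this PR's Examples section and counts the rows each option yields when `on='key'` is given. The names `other_by_key` and `row_counts` are introduced here only for illustration and are not part of the PR.

```python
import pandas as pd

caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                       'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

# join() matches caller['key'] against the *index* of the other frame,
# so the key column of `other` has to become its index first.
other_by_key = other.set_index('key')

# Hypothetical helper dict: number of result rows for each `how` option.
row_counts = {}
for how in ('left', 'right', 'outer', 'inner'):
    joined = caller.join(other_by_key, on='key', how=how)
    row_counts[how] = len(joined)

print(row_counts)
```

With this particular data every key in `other` also appears in `caller`, so 'outer' should keep the same six rows as 'left', while 'right' and 'inner' should both keep only the three matching rows.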
@@ -4343,6 +4345,46 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
 on, lsuffix, and rsuffix options are not supported when passing a list
 of DataFrame objects

+Examples
+--------
+>>> caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
+...                        'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
+
+>>> caller
+    A key
+0  A0  K0
+1  A1  K1
+2  A2  K2
+3  A3  K3
+4  A4  K4
+5  A5  K5
+
+>>> other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
+...                       'B': ['B0', 'B1', 'B2']})
+
+>>> other
+    B key
+0  B0  K0
+1  B1  K1
+2  B2  K2
+
+Perform a left join using caller's key column and other frame's index
+
+>>> caller.join(other.set_index('key'), on='key', how='left',
can you add this same example w/o using
Just add this example?
@jreback I don't think it is possible to not use @edublancas the
What about having just these two examples:
Perform a left join using caller's key column and other frame's index
Set
+...             lsuffix='_l', rsuffix='_r')
+
+>>>     A key    B
+0  A0  K0   B0
+1  A1  K1   B1
+2  A2  K2   B2
+3  A3  K3  NaN
+4  A4  K4  NaN
+5  A5  K5  NaN
+
+See also
+--------
+DataFrame.merge : For column(s)-on-columns(s) operations
+
 Returns
 -------
 joined : DataFrame
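The "See also" entry points to `DataFrame.merge` for column-on-column operations. A minimal sketch of how the two calls relate, assuming the same `caller` and `other` frames as in the docstring example (the variable names `joined` and `merged` are illustrative only):

```python
import pandas as pd

caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                       'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

# Column-on-index: other's key column is moved into its index for join().
joined = caller.join(other.set_index('key'), on='key', how='left')

# Column-on-column: merge() matches the two key columns directly,
# so no set_index() call is needed.
merged = caller.merge(other, on='key', how='left')

print(joined)
print(merged)
```

For this data both calls should give the same six rows, with `B` missing for K3 through K5. The practical difference is where the key lives in the other frame: in its index for `join`, in a column for `merge`.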
Why did you shorten this?
I think the " joins on indexes by default" is very useful explanation
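For readers following this thread, the default index-on-index behavior the comment refers to can be sketched as follows. This is only an illustration using the frames from the PR's example, not text proposed for the docstring; `by_index` and `overlap_error` are hypothetical names.

```python
import pandas as pd

caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                       'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

# No `on` argument: rows are matched by index label (0..5 against 0..2).
# Both frames carry a 'key' column, so suffixes keep the two apart.
by_index = caller.join(other, lsuffix='_l', rsuffix='_r')
print(by_index)

# Without suffixes, the overlapping 'key' column makes join() raise ValueError.
try:
    caller.join(other)
    overlap_error = None
except ValueError as exc:
    overlap_error = str(exc)
print(overlap_error)
```

Here the match happens on the integer index rather than on the key values, which is why the docstring example sets `key` as the index of `other` before joining with `on='key'`.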
I think the shorter explanation is better: