Improved documentation for DataFrame.join #12193
* left: use calling frame's index or column(s)
* right: use other frame's index
* outer: form union of calling frame's index or column(s) with
  other frame's index
I think this is more confusing, as it depends on whether `on` is specified or not (so you can simply say that).
What about this?
* left: use calling frame's index (or column if on is specified)
* right: use other frame's index
* outer: form union of calling frame's index (or column if on is specified) with other frame's index
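To make the proposed wording concrete, here is a minimal sketch of how the three `how` options pick the result's index when `on` is not specified. The frames and their values are hypothetical and are not taken from the PR.

```python
import pandas as pd

# Hypothetical frames for illustration only; they share the labels 'y' and 'z'.
caller = pd.DataFrame({"A": ["A0", "A1", "A2"]}, index=["x", "y", "z"])
other = pd.DataFrame({"B": ["B0", "B1", "B2"]}, index=["y", "z", "w"])

# left: the result has one row per label in caller's index.
left = caller.join(other, how="left")

# right: the result has one row per label in other's index.
right = caller.join(other, how="right")

# outer: the result covers the union of both indexes.
outer = caller.join(other, how="outer")

print(left, right, outer, sep="\n\n")
```

Rows with no match on the other side are filled with NaN, so `left` has NaN in `B` for label `x`.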
can you update
DOC: improves DataFrame.join documentation
Done, sorry for the delay.
Perform a left join using caller's key column and other frame's index
>>> caller.join(other.set_index('key'), on='key', how='left',
Can you add this same example w/o using `.set_index` as well (and w/o `on`), and indicate the difference between them?
Just add this example? `caller.join(other, how='left', lsuffix='_l', rsuffix='_r')`
@jreback I don't think it is possible to not use `set_index`, as `join` always uses the index of `other` (which is actually really confusing ...)
@edublancas the `lsuffix='_l', rsuffix='_r'` is redundant in this case, so I would leave it out
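A small sketch of the difference being discussed, assuming hypothetical frames in which row position and key value deliberately disagree. Without `set_index` and `on`, the rows are matched on the default integer index, and the suffixes are needed only because both frames carry a `key` column.

```python
import pandas as pd

# Hypothetical data (not from the PR); other's rows are in a different key order.
caller = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
other = pd.DataFrame({"key": ["K2", "K0"], "B": ["B2", "B0"]})

# No set_index and no `on`: rows are matched on the default integer index,
# and the overlapping 'key' column requires suffixes.
by_position = caller.join(other, how="left", lsuffix="_l", rsuffix="_r")

# other indexed by 'key' plus on='key': rows are matched by key value,
# and no suffixes are needed because no column names overlap.
by_key = caller.join(other.set_index("key"), on="key", how="left")

print(by_position)
print(by_key)
```

In `by_position`, row 0 pairs `K0` with `B2` purely because both sit at position 0, which is rarely what a key-based join is meant to do.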
What about having just these two examples:
Perform a left join using caller's key column and other frame's index
caller.join(other.set_index('key'), on='key', how='left')
Set `key` as the index column on `caller` and `other`, then perform an index-on-index join.
caller.set_index('key').join(other.set_index('key'), how='left')
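The two calls above match the same rows but return differently shaped results. The sketch below, using hypothetical frames that are not taken from the PR, shows where `key` ends up in each case.

```python
import pandas as pd

# Hypothetical frames for illustration only.
caller = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
other = pd.DataFrame({"key": ["K0", "K1"], "B": ["B0", "B1"]})

# Column-on-index: caller keeps its original index, and 'key' stays a column.
on_column = caller.join(other.set_index("key"), on="key", how="left")

# Index-on-index: 'key' becomes the index of the result.
on_index = caller.set_index("key").join(other.set_index("key"), how="left")

print(on_column)
print(on_index)
```

Which form is preferable probably depends on whether the caller's original index carries meaning that should survive the join.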
pls rebase/update
Sorry for the delay, I've been working to meet some deadlines for a project. I'll update in the next few days.
can you rebase/update
index-on-index and index-on-column(s) joins, but *joins on indexes* by default
rather than trying to join on common columns (the default behavior for
``merge``). If you are joining on index, you may wish to use ``DataFrame.join``
to save yourself some typing.
Why did you shorten this?
I think the "joins on indexes by default" part is a very useful explanation.
I think the shorter explanation is better:
index-on-index (by default) and column(s)-on-index join. If you are joining on index only, you may wish to use `DataFrame.join` to save yourself some typing.
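For readers comparing the two wordings, here is a minimal sketch of the default behaviors they describe, assuming hypothetical frames that are not part of the PR. `join` matches on the indexes by default, while `merge` with no arguments matches on the columns the frames share.

```python
import pandas as pd

# Hypothetical frames indexed by key; the row order differs on purpose.
left = pd.DataFrame({"A": ["A0", "A1"]}, index=["K0", "K1"])
right = pd.DataFrame({"B": ["B1", "B0"]}, index=["K1", "K0"])

# join: index-on-index by default (how='left').
joined = left.join(right)

# The same result spelled out with merge, which takes more typing.
merged = left.merge(right, left_index=True, right_index=True, how="left")

# merge's own default: an inner join on the common column(s), here 'key'.
left_c = left.rename_axis("key").reset_index()
right_c = right.rename_axis("key").reset_index()
on_columns = left_c.merge(right_c)

print(joined, merged, on_columns, sep="\n\n")
```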
can you update according to comments
can you rebase / update?
I think the documentation is clear now. There are 3 examples: one using the DataFrames' original indexes and two joining on the key columns, the first setting `key` as the index in both and the second using `on`.
thanks @edublancas nice improvement!
closes #12188
I modified the description in DataFrame.join to make clear the difference with DataFrame.merge, also added examples.