DOC: update DF.set_index #24762
Merged
894a080 DOC: update DF.set_index (h-vetinari)
faf8bcc oversights (h-vetinari)
8401fad Merge remote-tracking branch 'upstream/master' into set_index_docs (h-vetinari)
18597e2 Revert addition of list-likes to df.set_index (h-vetinari)
613ebed Remove dead code (h-vetinari)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4042,12 +4042,16 @@ def set_index(self, keys, drop=True, append=False, inplace=False, | |
Set the DataFrame index using existing columns. | ||
|
||
Set the DataFrame index (row labels) using one or more existing | ||
columns. The index can replace the existing index or expand on it. | ||
columns or arrays (of the correct length). The index can replace the | ||
existing index or expand on it. | ||
|
||
Parameters | ||
---------- | ||
keys : label or list of label | ||
Name or names of the columns that will be used as the index. | ||
keys : label or array-like or list of labels/arrays | ||
This parameter can be either a single column key, a single array of | ||
the same length as the calling DataFrame, or a list containing an | ||
arbitrary combination of column keys and arrays. Here, "array" | ||
encompasses :class:`Series`, :class:`Index` and ``np.ndarray``. | ||
drop : bool, default True | ||
Delete columns to be used as the new index. | ||
append : bool, default False | ||
|
@@ -4092,7 +4096,7 @@ def set_index(self, keys, drop=True, append=False, inplace=False, | |
7 2013 84 | ||
10 2014 31 | ||
|
||
Create a multi-index using columns 'year' and 'month': | ||
Create a MultiIndex using columns 'year' and 'month': | ||
|
||
>>> df.set_index(['year', 'month']) | ||
sale | ||
|
@@ -4102,35 +4106,51 @@ def set_index(self, keys, drop=True, append=False, inplace=False, | |
2013 7 84 | ||
2014 10 31 | ||
|
||
Create a multi-index using a set of values and a column: | ||
Create a MultiIndex using an Index and a column: | ||
|
||
>>> df.set_index([[1, 2, 3, 4], 'year']) | ||
>>> df.set_index([pd.Index([1, 2, 3, 4]), 'year']) | ||
month sale | ||
year | ||
1 2012 1 55 | ||
2 2014 4 40 | ||
3 2013 7 84 | ||
4 2014 10 31 | ||
|
||
Create a MultiIndex using two Series: | ||
|
||
>>> s = pd.Series([1, 2, 3, 4]) | ||
>>> df.set_index([s, s**2]) | ||
month year sale | ||
1 1 1 2012 55 | ||
2 4 4 2014 40 | ||
3 9 7 2013 84 | ||
4 16 10 2014 31 | ||
""" | ||
inplace = validate_bool_kwarg(inplace, 'inplace') | ||
if not isinstance(keys, list): | ||
|
||
err_msg = ('The parameter "keys" may be a column key, one-dimensional ' | ||
'array, or a list containing only valid column keys and ' | ||
'one-dimensional arrays.') | ||
|
||
if (is_scalar(keys) or isinstance(keys, tuple) | ||
or isinstance(keys, (ABCIndexClass, ABCSeries, np.ndarray))): | ||
# make sure we have a container of keys/arrays we can iterate over | ||
# tuples can appear as valid column keys! | ||
keys = [keys] | ||
Review comment: Strictly speaking, it would be possible to just keep wrapping everything that's not a list into a list, and raise in the for-loop below. But that's a bit hard to grok, and explicit is better than implicit, no? |
||
elif not isinstance(keys, list): | ||
raise ValueError(err_msg) | ||
|
||
missing = [] | ||
for col in keys: | ||
if (is_scalar(col) or isinstance(col, tuple)) and col in self: | ||
# tuples can be both column keys or list-likes | ||
# if they are valid column keys, everything is fine | ||
continue | ||
elif is_scalar(col) and col not in self: | ||
# tuples that are not column keys are considered list-like, | ||
# not considered missing | ||
missing.append(col) | ||
elif (not is_list_like(col, allow_sets=False) | ||
if (is_scalar(col) or isinstance(col, tuple)): | ||
# if col is a valid column key, everything is fine | ||
# tuples are always considered keys, never as list-likes | ||
if col not in self: | ||
missing.append(col) | ||
elif (not isinstance(col, (ABCIndexClass, ABCSeries, | ||
np.ndarray, list)) | ||
or getattr(col, 'ndim', 1) > 1): | ||
raise TypeError('The parameter "keys" may only contain a ' | ||
'combination of valid column keys and ' | ||
'one-dimensional list-likes') | ||
raise ValueError(err_msg) | ||
|
||
if missing: | ||
raise KeyError('{}'.format(missing)) | ||
|
@@ -4163,12 +4183,6 @@ def set_index(self, keys, drop=True, append=False, inplace=False, | |
elif isinstance(col, (list, np.ndarray)): | ||
arrays.append(col) | ||
names.append(None) | ||
elif (is_list_like(col) | ||
and not (isinstance(col, tuple) and col in self)): | ||
# all other list-likes (but avoid valid column keys) | ||
col = list(col) # ensure iterator do not get read twice etc. | ||
arrays.append(col) | ||
names.append(None) | ||
# from here, col can only be a column label | ||
else: | ||
arrays.append(frame[col]._values) | ||
|
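To make the new validation flow easier to follow outside the diff, here is a minimal standalone sketch of it. It is not the pandas implementation: `FakeArray`, `is_key_like` and `normalize_keys` are hypothetical names introduced only for illustration, `FakeArray` stands in for `Series`, `Index` and `np.ndarray`, and `is_key_like` only approximates `is_scalar(...) or isinstance(..., tuple)`. The sketch assumes the behaviour described in the diff above: tuples are always treated as column keys, sets and other non-list containers are rejected with `ValueError`, and missing keys are collected and reported together in a `KeyError`.

```python
from numbers import Number

ERR_MSG = ('The parameter "keys" may be a column key, one-dimensional '
           'array, or a list containing only valid column keys and '
           'one-dimensional arrays.')


class FakeArray:
    """Hypothetical stand-in for Series / Index / np.ndarray."""

    def __init__(self, values, ndim=1):
        self.values = list(values)
        self.ndim = ndim


def is_key_like(obj):
    # Rough approximation of "is_scalar(obj) or isinstance(obj, tuple)":
    # scalars and tuples are always column keys, never arrays.
    return obj is None or isinstance(obj, (str, bytes, Number, tuple))


def normalize_keys(keys, columns):
    """Validate ``keys`` against ``columns`` and return them as a list."""
    if is_key_like(keys) or isinstance(keys, FakeArray):
        # Wrap a single key or a single array so the loop below can iterate.
        keys = [keys]
    elif not isinstance(keys, list):
        # Anything else (set, dict, iterator, ...) is rejected up front.
        raise ValueError(ERR_MSG)

    missing = []
    for col in keys:
        if is_key_like(col):
            # Key-like entries must name an existing column.
            if col not in columns:
                missing.append(col)
        elif (not isinstance(col, (FakeArray, list))
                or getattr(col, 'ndim', 1) > 1):
            # Only one-dimensional arrays and lists are accepted here.
            raise ValueError(ERR_MSG)

    if missing:
        raise KeyError('{}'.format(missing))
    return keys
```

With `columns = ['year', 'month']`, calling `normalize_keys('year', columns)` returns `['year']`, while `normalize_keys({'year'}, columns)` raises `ValueError` because a set is neither a key, an array, nor a list.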