DOC: update and separate the Series.drop and DataFrame.drop docstring #20120

Merged: 8 commits merged on Mar 13, 2018
Changes from 2 commits
123 changes: 123 additions & 0 deletions pandas/core/frame.py
@@ -3035,6 +3035,129 @@ def reindex_axis(self, labels, axis=0, method=None, level=None, copy=True,
method=method, level=level, copy=copy,
limit=limit, fill_value=fill_value)

def drop(self, labels=None, axis=0, index=None, columns=None,
level=None, inplace=False, errors='raise'):
"""
Drop rows or columns.

Remove rows or columns by specifying label names and corresponding
axis, or by specifying directly index or column names. When using a
multi-index, labels on different levels can be removed by specifying
the level name or int.

Parameters
----------
labels : single label or list-like
Index or column labels to drop.
axis : int or axis name, default 0
Review comment (Contributor): cf. the Gitter discussion: axis : {0 or 'index', 1 or 'columns'}, default 0

Whether to drop labels from the index (0 / 'index') or
columns (1 / 'columns').
Review comment (Contributor): I would write (0 or 'index') and columns (1 or 'columns').

index : None
Redundant for application on Series, but index can be used instead
Review comment (Contributor): As this is now in the DataFrame section, this is probably not redundant?

of labels.
columns : None
Redundant for application on Series, but index can be used instead
Review comment (Contributor): As this is now in the DataFrame section, this is probably not redundant?

of labels.

.. versionadded:: 0.21.0
level : int or level name, optional
For MultiIndex.
Review comment (Contributor): Is it useful to make this more detailed? E.g. "Level for which the labels will be removed".

inplace : bool, default False
If True, do operation inplace and return None.
errors : {'ignore', 'raise'}, default 'raise'
If 'ignore', suppress error and existing labels are dropped.

Returns
-------
dropped : type of caller
Review comment (Contributor): In the generic version, this was the same type as the caller, but here we know it is a DataFrame.


See Also
--------
DataFrame.dropna : Return DataFrame with labels on given axis omitted
where (all or any) data are missing
DataFrame.drop_duplicates : Return DataFrame with duplicate rows
removed, optionally only considering certain columns
Review comment (Contributor): Reference DataFrame.loc (first) as well, as this is an indexing operation.

Review comment (Contributor): Also Series.drop.


Raises
------
KeyError
If none of the labels are found in the selected axis

Examples
--------
>>> df = pd.DataFrame(np.arange(12).reshape(3,4),
... columns=['A', 'B', 'C', 'D'])
>>> df
A B C D
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11

Drop columns

>>> df.drop(['B', 'C'], axis=1)
A D
0 0 3
1 4 7
2 8 11

>>> df.drop(columns=['B', 'C'])
A D
0 0 3
1 4 7
2 8 11

Drop a row by index

>>> df.drop([0, 1])
A B C D
2 8 9 10 11

Drop columns and/or rows of MultiIndex
Review comment (Contributor): MultiIndexed DataFrame


>>> midx = pd.MultiIndex(levels=[['lama','cow','falcon'],
Review comment (Contributor): Missing spaces after ","

... ['speed','weight','length']],
... labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
... [0, 1, 2, 0, 1, 2, 0, 1, 2]])
Review comment (Contributor): I would make this example a bit simpler, maybe using integers for the value columns (and ordering them, e.g. using np.arange(len(midx)) to create big/small).

>>> df = pd.DataFrame(index=midx, columns=['big','small'],
... data=[[45,30],[200,100],[1.5,1],[30,20],
[250,150],[1.5,0.8],[320,250],[1,0.8],
[0.3,0.2]])
>>> df
big small
lama speed 45.0 30.0
Review comment (Contributor): I don't think this is laid out correctly, a copy-paste issue?

weight 200.0 100.0
length 1.5 1.0
cow speed 30.0 20.0
weight 250.0 150.0
length 1.5 0.8
falcon speed 320.0 250.0
weight 1.0 0.8
length 0.3 0.2

>>> df.drop(index='cow',columns='small')
Review comment (Contributor): Missing space after ","

big
lama speed 45.0
weight 200.0
length 1.5
falcon speed 320.0
weight 1.0
length 0.3

>>> df.drop(index='length', level=1)
big small
lama speed 45.0 30.0
weight 200.0 100.0
cow speed 30.0 20.0
weight 250.0 150.0
falcon speed 320.0 250.0
weight 1.0 0.8
"""
return super(DataFrame,
self).drop(labels=labels, axis=axis, index=index,
Review comment (Contributor): Indentation is a bit strange here. I would put self on the same line.

columns=columns, level=level, inplace=inplace,
errors=errors)

@rewrite_axis_style_signature('mapper', [('copy', True),
('inplace', False),
('level', None)])
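
The DataFrame.drop docstring in this hunk describes two ways of naming what to remove (labels plus axis, or the index/columns keywords) and a level argument for a MultiIndex. The following is a minimal sketch of those three cases, not part of the PR. It assumes a recent pandas release and builds the MultiIndex with pd.MultiIndex.from_product instead of the constructor call shown in the diff, purely so the sketch does not depend on constructor keyword names; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Small frame mirroring the docstring example.
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["A", "B", "C", "D"])

# Two equivalent spellings for dropping columns.
by_axis = df.drop(["B", "C"], axis=1)
by_keyword = df.drop(columns=["B", "C"])
same_result = by_axis.equals(by_keyword)

# index= and columns= can be combined in a single call.
trimmed = df.drop(index=[0], columns=["D"])

# MultiIndex built with from_product (an assumption of this sketch).
midx = pd.MultiIndex.from_product(
    [["lama", "cow", "falcon"], ["speed", "weight", "length"]]
)
mdf = pd.DataFrame({"big": np.arange(9), "small": np.arange(9, 18)}, index=midx)

# Drop every row whose second-level label is 'length'.
no_length = mdf.drop(index="length", level=1)
print(no_length)
```

Integer values created with np.arange are used here in the spirit of the reviewer's suggestion to simplify the MultiIndex example.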
66 changes: 0 additions & 66 deletions pandas/core/generic.py
@@ -2799,73 +2799,7 @@ def reindex_like(self, other, method=None, copy=True, limit=None,

def drop(self, labels=None, axis=0, index=None, columns=None, level=None,
inplace=False, errors='raise'):
"""
Return new object with labels in requested axis removed.

Parameters
----------
labels : single label or list-like
Index or column labels to drop.
axis : int or axis name
Whether to drop labels from the index (0 / 'index') or
columns (1 / 'columns').
index, columns : single label or list-like
Alternative to specifying `axis` (``labels, axis=1`` is
equivalent to ``columns=labels``).

.. versionadded:: 0.21.0
level : int or level name, default None
For MultiIndex
inplace : bool, default False
If True, do operation inplace and return None.
errors : {'ignore', 'raise'}, default 'raise'
If 'ignore', suppress error and existing labels are dropped.

Returns
-------
dropped : type of caller

Raises
------
KeyError
If none of the labels are found in the selected axis

Examples
--------
>>> df = pd.DataFrame(np.arange(12).reshape(3,4),
columns=['A', 'B', 'C', 'D'])
>>> df
A B C D
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11

Drop columns

>>> df.drop(['B', 'C'], axis=1)
A D
0 0 3
1 4 7
2 8 11

>>> df.drop(columns=['B', 'C'])
A D
0 0 3
1 4 7
2 8 11

Drop a row by index

>>> df.drop([0, 1])
A B C D
2 8 9 10 11

Notes
-----
Specifying both `labels` and `index` or `columns` will raise a
ValueError.

"""
inplace = validate_bool_kwarg(inplace, 'inplace')

if labels is not None:
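
The generic docstring removed in this hunk carried a note that combining labels with index or columns raises a ValueError, alongside the errors and inplace parameters. The sketch below illustrates those behaviours under the assumption of a recent pandas release; it is not taken from the PR. In recent releases a single missing label appears to be enough to raise KeyError, which is slightly stricter than the "none of the labels" wording, so treat that part as an observation about current behaviour rather than about the version under review.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["A", "B", "C", "D"])

# Mixing `labels` with `index`/`columns` is rejected.
try:
    df.drop(labels=["A"], columns=["B"])
    mixed_error = None
except ValueError as exc:
    mixed_error = exc

# A label missing from the axis raises KeyError by default ...
try:
    df.drop(columns=["A", "Z"])
    missing_error = None
except KeyError as exc:
    missing_error = exc

# ... while errors='ignore' drops only the labels that do exist.
lenient = df.drop(columns=["A", "Z"], errors="ignore")

# inplace=True mutates the caller and returns None.
scratch = df.copy()
result = scratch.drop(columns=["D"], inplace=True)
```

Catching the exceptions into variables keeps the sketch runnable from top to bottom; in real code the call would normally be left to raise.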
91 changes: 91 additions & 0 deletions pandas/core/series.py
@@ -2660,6 +2660,97 @@ def rename(self, index=None, **kwargs):
def reindex(self, index=None, **kwargs):
return super(Series, self).reindex(index=index, **kwargs)

def drop(self, labels=None, axis=0, index=None, columns=None,
level=None, inplace=False, errors='raise'):
"""
Return Series with specified index labels removed.

Remove elements of a Series based on specifying the index labels.
When using a multi-index, labels on different levels can be removed
by specifying the level name or int.

Parameters
----------
labels : single label or list-like
Index labels to drop.
axis : 0, default 0
Redundant for application on Series.
index : None
Redundant for application on Series, but index can be used instead
of labels.
columns : None
Redundant for application on Series, but index can be used instead
of labels.
.. versionadded:: 0.21.0
level : int or level name, optional
For MultiIndex.
inplace : bool, default False
If True, do operation inplace and return None.
errors : {'ignore', 'raise'}, default 'raise'
If 'ignore', suppress error and existing labels are dropped.

Returns
-------
dropped : type of caller
Review comment (Contributor): pandas.Series


See Also
--------
Series.reindex : Return only specified index labels of Series.
Series.dropna : Return series without null values.
Series.drop_duplicates : Return Series with duplicate values removed.

Raises
------
KeyError
If none of the labels are found in the index.

Examples
--------
>>> s = pd.Series(data=np.arange(3), index=['A','B','C'])
>>> s
A 0
B 1
C 2
dtype: int64

Drop labels B and C

>>> s.drop(labels=['B','C'])
A 0
dtype: int64

Drop 2nd level label in MultiIndex Series

>>> midx = pd.MultiIndex(levels=[['lama','cow','falcon'],
... ['speed','weight','length']],
... labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
... [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> s = pd.Series(data=[45,200,1.2,30,250,1.5,320,1,0.3], index=midx)
>>> s
lama speed 45.0
weight 200.0
length 1.2
cow speed 30.0
weight 250.0
length 1.5
falcon speed 320.0
weight 1.0
length 0.3
dtype: float64

>>> s.drop(labels='weight', level=1)
lama speed 45.0
length 1.2
cow speed 30.0
length 1.5
falcon speed 320.0
length 0.3
dtype: float64
"""
return super(Series, self).drop(labels=labels, axis=axis, index=index,
columns=columns, level=level,
inplace=inplace, errors=errors)

@Appender(generic._shared_docs['fillna'] % _shared_doc_kwargs)
def fillna(self, value=None, method=None, axis=None, inplace=False,
limit=None, downcast=None, **kwargs):
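
The Series.drop docstring added in this hunk says that index can be used instead of labels and that level selects the MultiIndex level to match against. A minimal sketch of both points follows, assuming a recent pandas release; as an assumption of this sketch, the MultiIndex is built with pd.MultiIndex.from_product and not with the constructor call from the diff, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(3), index=["A", "B", "C"])

# labels= and index= name the same thing on a Series.
via_labels = s.drop(labels=["B", "C"])
via_index = s.drop(index=["B", "C"])
same_result = via_labels.equals(via_index)

# MultiIndex built with from_product (an assumption of this sketch).
midx = pd.MultiIndex.from_product(
    [["lama", "cow", "falcon"], ["speed", "weight", "length"]]
)
ms = pd.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1, 0.3], index=midx)

# Remove the 'weight' entries found on the second index level.
no_weight = ms.drop(labels="weight", level=1)
print(no_weight)
```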