-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
DOC: Fix pandas.Series.resample docstring #23197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 5 commits
2c23296
7940cbe
ff8009e
f71d48f
49fbd6b
b8f8b0e
ad1cada
ceff696
8cd9a89
ce7886a
abb2b58
bbddcd2
e0aef30
9b3c7d7
fab0ad0
7567e12
109e414
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7366,16 +7366,32 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
label=None, convention='start', kind=None, loffset=None, | ||
limit=None, base=0, on=None, level=None): | ||
""" | ||
Resample time-series data. | ||
|
||
Convenience method for frequency conversion and resampling of time | ||
series. Object must have a datetime-like index (DatetimeIndex, | ||
PeriodIndex, or TimedeltaIndex), or pass datetime-like values | ||
to the on or level keyword. | ||
series. Object must have a datetime-like index (``DatetimeIndex``, | ||
``PeriodIndex``, or ``TimedeltaIndex``), or pass datetime-like values | ||
to the ``on`` or ``level`` keyword. | ||
|
||
Parameters | ||
---------- | ||
rule : string | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you use the Python types ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, done. The module is full of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it's better to fix one whole docstring at a time, than fixing a problem in all docstrings. Sphinx doesn't care about the types, but we try to be consistent, and will start validating the types soon (not sure if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, atomic improvements around each docstring make total sense. Pushing the amendments now. |
||
the offset string or object representing target conversion | ||
The offset string or object representing target conversion. | ||
how : string | ||
Method for down-/re-sampling, default to ‘mean’ for downsampling. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What are the possible values? If they are a fixed set, can we have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Possible values are any valid string function that does some form of aggregation. Interestingly, this would also work with the example dataframe I shared above:
Happy to change the description of the parameter to something like:
I got inspiration from pandas.DataFrame.aggregate. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. f the name needs to be a string with the function name, I assume it can't be any arbitrary value. But as it's deprecated, let's not spend time on this. |
||
|
||
.. deprecated:: 0.18.0 | ||
The new syntax is ``.resample(...).mean()``, or | ||
``.resample(...).apply(<func>)`` | ||
axis : int, optional, default 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you check other docstrings to see how the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another example is:
The problem is that for Maybe a more verbose but informative doc would be
I have added this tentatively to the PR but let me know if something less verbose would be better. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry I wasn't clear. There is not much standardization of the descriptions, use whatever makes more sense to you. But the used type is somehow standard, that's what I think we should copy from another docstring. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jmrr can you make this change? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @datapythonista not your fault at all. I somewhat skipped the word type in your request and I thought you meant how
If we extrapolate this to
I will make this change now, hope it sounds good. |
||
Which index to use for up- or down-sampling. Must be | ||
``DatetimeIndex``, ``TimedeltaIndex`` or ``PeriodIndex``. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand the must be. I don't think axis must be DatetimeIndex... Also, don't use backticks around the "along the rows". Backticks are to tell that it's a reference (to a variable, a function, a class...). We don't use them for text. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I took it from the main function description. As In fact, following this logic, For instance:
I hope I'm not missing anything here...
Slip of keyboard, I meant normal textual quotes ("). Changing. |
||
fill_method : string, default None | ||
Filling method for upsampling. | ||
|
||
.. deprecated:: 0.18.0 | ||
The new syntax is ``.resample(...).<func>()``, | ||
e.g. ``.resample(...).pad()`` | ||
closed : {'right', 'left'} | ||
Which side of bin interval is closed. The default is 'left' | ||
for all frequency offsets except for 'M', 'A', 'Q', 'BM', | ||
|
@@ -7385,18 +7401,22 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
for all frequency offsets except for 'M', 'A', 'Q', 'BM', | ||
'BA', 'BQ', and 'W' which all have a default of 'right'. | ||
convention : {'start', 'end', 's', 'e'} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some default values are missing, like in this case. |
||
For PeriodIndex only, controls whether to use the start or end of | ||
`rule` | ||
kind: {'timestamp', 'period'}, optional | ||
For ``PeriodIndex`` only, controls whether to use the start or | ||
end of ``rule``. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. single backticks, in both cases, double backticks are for code (including possible values of variables like NaN). But for classes, variables, parameters... we use single backticks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it, wasn't clear from the docstring docs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would None be with double backtick in examples like "default There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good question. I think it probably should, but we don't have it anywhere, so let's leave it without backticks in the default for now |
||
kind : {'timestamp', 'period'} optional | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. missing comma before optional |
||
Pass 'timestamp' to convert the resulting index to a | ||
``DateTimeIndex`` or 'period' to convert it to a ``PeriodIndex``. | ||
By default the input representation is retained. | ||
loffset : timedelta | ||
Adjust the resampled time labels | ||
Adjust the resampled time labels. | ||
limit : int, default None | ||
Maximum size gap when reindexing with ``fill_method``. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same |
||
|
||
.. deprecated:: 0.18.0 | ||
base : int, default 0 | ||
For frequencies that evenly subdivide 1 day, the "origin" of the | ||
aggregated intervals. For example, for '5min' frequency, base could | ||
range from 0 through 4. Defaults to 0 | ||
range from 0 through 4. Defaults to 0. | ||
on : string, optional | ||
For a DataFrame, column to use instead of index for resampling. | ||
Column must be datetime-like. | ||
|
@@ -7405,7 +7425,7 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
|
||
level : string or int, optional | ||
For a MultiIndex, level (name or number) to use for | ||
resampling. Level must be datetime-like. | ||
resampling. ``level`` must be datetime-like. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same |
||
|
||
.. versionadded:: 0.19.0 | ||
|
||
|
@@ -7522,9 +7542,10 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
For a Series with a PeriodIndex, the keyword `convention` can be | ||
used to control whether to use the start or end of `rule`. | ||
|
||
>>> s = pd.Series([1, 2], index=pd.period_range('2012-01-01', | ||
freq='A', | ||
periods=2)) | ||
>>> s = pd.Series([1, 2], index=pd.period_range( | ||
... '2012-01-01', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd keep the date in the previous line |
||
... freq='A', | ||
... periods=2)) | ||
>>> s | ||
2012 1 | ||
2013 2 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the following examples, when using resample, it once uses
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've created two examples, one sampling a year into quarters using |
||
|
@@ -7577,9 +7598,9 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
|
||
>>> time = pd.date_range('1/1/2000', periods=5, freq='T') | ||
>>> df2 = pd.DataFrame(data=10*[range(4)], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. pep8 issue, missing spaces around operator. |
||
columns=['a', 'b', 'c', 'd'], | ||
index=pd.MultiIndex.from_product([time, [1, 2]]) | ||
) | ||
... columns=['a', 'b', 'c', 'd'], | ||
... index=pd.MultiIndex.from_product([time, [1, 2]]) | ||
... ) | ||
>>> df2.resample('3T', level=0).sum() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same comments as in the previous example. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, also with a different, more realistic approach. Please let me know your comments. I also struggled a bit with the formatting of long lines, opted for styling lines 7627-7632 using black (also compatible with PEP8) but let me know it breaks consistency and should be reverted to the old style. Pushing now. |
||
a b c d | ||
2000-01-01 00:00:00 0 6 12 18 | ||
jmrr marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single backticks.