-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
DOC: DataFrame.ewm clean-up #32212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: DataFrame.ewm clean-up #32212
Conversation
There's more there. I am not familiar with I don't understand this invention. I don't see why using |
You will want to build the docs: If you can post screenshot afterwards would be helpful We use sphinx to build restructured text. Math is a role provided therein: https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#math |
Ready for Review |
Looks like CI is red - can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
Thanks @sursu , this is much nicer!
Just to clarify - in the doc for the respective series method (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.ewm.html), it says
However, Series don't have columns :) But that's a recurring issue in many parts of the documentation, so IMO it's OK to keep it separate from this |
Oh, so Well if someone tries to perform an EW function column-wise on a series they would bump into: ValueError: No axis named columns for object type <class 'pandas.core.series.Series'> which is a clear message. Also, it seems that you can also use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work on this @sursu, nice stuff. Added some comments, most of them are related to the original docstring, and not your changes, but if you don't mind at having a look at them that would be very useful.
Not a problem if you think this should better be addressed in a follow up, or by someone else. Thanks!
pandas/core/window/ewm.py
Outdated
Available EW (exponentially weighted) methods: ``mean()``, ``var()``, ``std()``, | ||
``corr()``, ``cov()``. | ||
|
||
When ``adjust=True`` (default), the EW function is calculated using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have this in the adjust
parameter description instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 cases there which require further details. So I think that is better addressed in the Notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but I don't understand the reasoning. What I have in mind is that if I'm interested in what the adjust
parameter does, I'll go to its description to understand what it does. And the description there is somehow poor, while here he have all the relevant information.
So, this feels misplaced to me. Not sure how the fact that this happens twice in this page changes this reasoning. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, the parameter descriptions are useful when I need to check briefly what parameters may be relevant, what they do, what possible values can they take (via Shift+Tab).
The description needs to be concise, if you want more details about each case, scroll a little below and look at the Notes.
To be able to read the equations there you will have to open the browser. When someone will want to quickly preview the description with Shift+Tab, it will just be cluttered with sphinx signs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand how you use the docs, but in the link I gave you explains what's our convention regarding the sections, so I prefer to follow that, and have the parameters sections for the parameters, and the notes for the implementation notes.
pandas/core/window/ewm.py
Outdated
y_t &= (1 - \alpha) y_{t-1} + \alpha x_t, | ||
\end{split} | ||
|
||
When ``ignore_na=False`` (default), weights are based on absolute positions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this in the ignore_na
one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current parameter description of ignore_na
is fine.
pandas/core/window/ewm.py
Outdated
|
||
More details can be found at: | ||
`Exponentially weighted windows | ||
<https://pandas.pydata.org/pandas-docs/stable/user_guide/computation.html#exponentially-weighted-windows>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sphinx allow you to create links automatically with :ref:
. We should use this, since we may be changing the path of the docs (may be using the domain pandas.io, or using pandas.pydata.org/docs/ instead of pandas-docs/stable...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried that, didn't succeed. If you can instruct me here, I'd be thankful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you grep for :ref:
you can find examples, like:
:ref:`API Changes <whatsnew_0141.api>`
What's inside <>
is a label somewhere in the code, defined as .. _whatsnew_0141.api:
. If you don't have a label for the section you're linking here (it probably exists), you can just create it following the convention of the other labels in that page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not able to make :ref:
work.
In the same file there is not label defined.
This is my first time working with Sphinx, that's why I'm asking for help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I can provide more information that what I said above. With that, and checking other examples, should be quite straight forward to get this working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you tried, but this should be simply:
:ref:`Exponentially weighted windows <stats.moments.exponentially_weighted>`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:ref:`Exponentially weighted windows <stats.moments.exponentially_weighted>`
Does not produce a link. Does not give me an error either.
Could it be that I don't see it as a link because I build only the pandas.DataFrame.ewm
page?
pandas/core/window/ewm.py
Outdated
|
||
More details can be found at | ||
https://pandas.pydata.org/pandas-docs/stable/user_guide/computation.html#exponentially-weighted-windows | ||
Exactly one paramter: ``com``, ``span``, ``halflife``, or ``alpha`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be part of the extended summary, not the notes. And I think the extended summary should provide information on what it does. It's not a blocker to get this merged, but there is a lot of content in this page, but reading it doesn't really explain what this functionality does IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean after "Provide exponential weighted functions."?
I agree that Exactly one parameter: ``com``, ``span``, ``halflife``, or ``alpha`` must be provided.
could be added there. Should I do that and then delete the first paragraph from the notes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct. You have information on the different sections, and what should be in each at: https://pandas.pydata.org/pandas-docs/stable/development/contributing_docstring.html
pandas/core/window/ewm.py
Outdated
Available EW (exponentially weighted) methods: ``mean()``, ``var()``, ``std()``, | ||
``corr()``, ``cov()``. | ||
|
||
When ``adjust=True`` (default), the EW function is calculated using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand how you use the docs, but in the link I gave you explains what's our convention regarding the sections, so I prefer to follow that, and have the parameters sections for the parameters, and the notes for the implementation notes.
pandas/core/window/ewm.py
Outdated
|
||
More details can be found at: | ||
`Exponentially weighted windows | ||
<https://pandas.pydata.org/pandas-docs/stable/user_guide/computation.html#exponentially-weighted-windows>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I can provide more information that what I said above. With that, and checking other examples, should be quite straight forward to get this working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sursu, looks great.
Couple of minor things that I think should make reading this easier.
pandas/core/window/ewm.py
Outdated
@@ -52,9 +58,38 @@ class EWM(_Rolling): | |||
Divide by decaying adjustment factor in beginning periods to account | |||
for imbalance in relative weightings | |||
(viewing EWMA as a moving average). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can merge this in the previous line.
pandas/core/window/ewm.py
Outdated
@@ -52,9 +58,38 @@ class EWM(_Rolling): | |||
Divide by decaying adjustment factor in beginning periods to account | |||
for imbalance in relative weightings | |||
(viewing EWMA as a moving average). | |||
When ``adjust=True`` (default), the EW function is calculated using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave a blank line before, so it's more readable in the markdown version. And make the two When adjust=..
a bullet point, so it's easier to read in the html.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the problem. I think should be as easy as using something like:
* When adjust=whatever
continuation lines are indented
if one is two long to be more than 88 just add
a line break before it is.
And leave a blank line after the bullet point.
Is there any problem with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/core/window/ewm.py
Outdated
Ignore missing values when calculating weights; specify ``True`` to reproduce | ||
pre-0.15.0 behavior. | ||
|
||
When ``ignore_na=False`` (default), weights are based on absolute positions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same about making them a bullet points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only the link is pending, but looks good.
pandas/core/window/ewm.py
Outdated
|
||
More details can be found at: | ||
`Exponentially weighted windows | ||
<https://pandas.pydata.org/pandas-docs/stable/user_guide/computation.html#exponentially-weighted-windows>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you tried, but this should be simply:
:ref:`Exponentially weighted windows <stats.moments.exponentially_weighted>`
You'll have to build all the docs, not that single page to see the link. Also, if you can merge master please, so we make the CI green and we can merge... |
Thanks for the good docstring @sursu |
Cosmetic change.