rolling_mean with freq='D' returns all NaNs when there is exactly 1 data point per day #5955

Closed
@sleibman

related to #3020

$ python
Python 2.7.4 (default, Apr 23 2013, 12:22:04) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> import pandas as pd
>>> pd.__version__
'0.12.0'
>>> indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
>>> series = pd.Series(range(1, 6), index=indices)
>>> series = series.map(lambda x: float(x))  # range() returns ints, so force to float
>>> series = series.sort_index()  # already sorted, but just to be clear
>>> series  # here's what our input series looks like
1975-01-01 12:00:00    1
1975-01-02 12:00:00    2
1975-01-03 12:00:00    3
1975-01-04 12:00:00    4
1975-01-05 12:00:00    5
dtype: float64
>>> pd.rolling_mean(series, window=2, freq='D')  # these results will be wrong
1975-01-01   NaN
1975-01-02   NaN
1975-01-03   NaN
1975-01-04   NaN
1975-01-05   NaN
Freq: D, dtype: float64
>>> better_series = series.append(pd.Series([3.0], index=[datetime.datetime(1975, 1, 3, 6, 0)]))
>>> better_series = better_series.sort_index()
>>> better_series  # here's a revised input with more than one datapoint on one of the days
1975-01-01 12:00:00    1
1975-01-02 12:00:00    2
1975-01-03 06:00:00    3
1975-01-03 12:00:00    3
1975-01-04 12:00:00    4
1975-01-05 12:00:00    5
dtype: float64
>>> pd.rolling_mean(better_series, window=2, freq='D')  # These results will be correct and are what I expected above
1975-01-01    NaN
1975-01-02    1.5
1975-01-03    2.5
1975-01-04    3.5
1975-01-05    4.5
Freq: D, dtype: float64
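
For anyone reproducing this on a newer pandas, here is a minimal sketch of the computation I expected in the first example, written as two explicit steps: downsample to daily means, then take a 2-period rolling mean. It assumes a pandas version that provides `Series.resample(...).mean()` and `Series.rolling(...)`; the top-level `pd.rolling_mean` used above is not available in later releases. The names `daily` and `result` are only for illustration, and this is a workaround sketch, not a description of how `rolling_mean(..., freq='D')` is implemented internally.

```python
import datetime

import pandas as pd

# Same input as the failing case: exactly one data point per day, at noon.
indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
series = pd.Series([float(x) for x in range(1, 6)], index=indices)

# Step 1: downsample to daily frequency, averaging the points within each day.
daily = series.resample("D").mean()

# Step 2: rolling mean over 2 consecutive daily values.
result = daily.rolling(window=2).mean()

# Expected values: NaN, 1.5, 2.5, 3.5, 4.5
print(result)
```

With one point per day the daily means equal the original values, so the result matches the output shown for `better_series`.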
