Closed
Related to #3020.
$ python
Python 2.7.4 (default, Apr 23 2013, 12:22:04)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> import pandas as pd
>>> pd.__version__
'0.12.0'
>>> indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
>>> series = pd.Series(range(1, 6), index=indices)
>>> series = series.map(lambda x: float(x)) # range() returns ints, so force to float
>>> series = series.sort_index() # already sorted, but just to be clear
>>> series # here's what our input series looks like
1975-01-01 12:00:00 1
1975-01-02 12:00:00 2
1975-01-03 12:00:00 3
1975-01-04 12:00:00 4
1975-01-05 12:00:00 5
dtype: float64
>>> pd.rolling_mean(series, window=2, freq='D') # these results will be wrong: every value is NaN even though each day has one datapoint
1975-01-01 NaN
1975-01-02 NaN
1975-01-03 NaN
1975-01-04 NaN
1975-01-05 NaN
Freq: D, dtype: float64
>>> better_series = series.append(pd.Series([3.0], index=[datetime.datetime(1975, 1, 3, 6, 0)]))
>>> better_series = better_series.sort_index()
>>> better_series # here's a revised input with more than one datapoint on one of the days
1975-01-01 12:00:00 1
1975-01-02 12:00:00 2
1975-01-03 06:00:00 3
1975-01-03 12:00:00 3
1975-01-04 12:00:00 4
1975-01-05 12:00:00 5
dtype: float64
>>> pd.rolling_mean(better_series, window=2, freq='D') # These results will be correct and are what I expected above
1975-01-01 NaN
1975-01-02 1.5
1975-01-03 2.5
1975-01-04 3.5
1975-01-05 4.5
Freq: D, dtype: float64
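
For comparison, here is a minimal sketch of the result I expected. It assumes a recent pandas release, where `pd.rolling_mean` is no longer available and the same computation is written with `Series.resample(...).mean()` followed by `Series.rolling(...).mean()`. The helper name `daily_rolling_mean` is hypothetical and only used for illustration; this is not a claim about how `rolling_mean(..., freq='D')` was implemented in 0.12.0.

```python
import datetime

import pandas as pd


def daily_rolling_mean(s, window=2):
    # Hypothetical helper (not part of pandas): average all points that fall
    # on the same calendar day, then take a rolling mean over the daily values.
    daily = s.resample("D").mean()
    return daily.rolling(window=window).mean()


# Same input as in the session above: one datapoint per day at 12:00.
indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
series = pd.Series([float(x) for x in range(1, 6)], index=indices)

# Revised input with a second datapoint on 1975-01-03.
# pd.concat is used here because Series.append is gone in recent releases.
extra = pd.Series([3.0], index=[datetime.datetime(1975, 1, 3, 6, 0)])
better_series = pd.concat([series, extra]).sort_index()

result = daily_rolling_mean(series)
result_better = daily_rolling_mean(better_series)
print(result)
print(result_better)
```

Under these assumptions both inputs should give NaN for 1975-01-01 followed by 1.5, 2.5, 3.5, 4.5, which matches the output of the second `rolling_mean` call above.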