Unexpected result with Resampler.apply for non-naive time index #25411

Closed
@kdebrab

Description

When resampling a timezone-aware (non-naive) time series with a custom function (using apply or agg), the result differs from what the same function returns when called on the data directly.

import pandas as pd

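def weighted_quantile(series, weights, q):
    """Return the smallest value whose cumulative weight reaches fraction q of the total weight."""
    series = series.sort_values()
    # Align the weights with the sorted values; labels without a weight count as 0.
    cumsum = weights.reindex(series.index).fillna(0).cumsum()
    cutoff = cumsum.iloc[-1] * q
    return series[cumsum >= cutoff].iloc[0]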

times = pd.date_range('2017-6-23 18:00', periods=8, freq='15T', tz='UTC')
data = pd.Series([1., 1, 1, 1, 1, 2, 2, 0], index=times)
weights = pd.Series([160., 91, 65, 43, 24, 10, 1, 0], index=times)

data.resample('D').apply(weighted_quantile, weights=weights, q=0.5)
Out[2]: 
2017-06-23 00:00:00+00:00    0.0
Freq: D, dtype: float64

Expected Output

The (single) value of the series should correspond with:

weighted_quantile(data, weights=weights, q=0.5)
Out[3]: 1.0
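
The expected value of 1.0 can also be checked by hand: the total weight is 394, so the cutoff for q=0.5 is 197, and the cumulative weight first reaches it at the data value 1. A small sketch of that arithmetic, using the same data as above (the names `cum_weight_by_value`, `cutoff` and `expected` are introduced here for illustration only, and `'15min'` is used as an equivalent alias for `'15T'`):

```python
import pandas as pd

times = pd.date_range('2017-6-23 18:00', periods=8, freq='15min', tz='UTC')
data = pd.Series([1., 1, 1, 1, 1, 2, 2, 0], index=times)
weights = pd.Series([160., 91, 65, 43, 24, 10, 1, 0], index=times)

# Total weight per distinct data value, accumulated in ascending value order.
cum_weight_by_value = weights.groupby(data).sum().cumsum()

# Half of the total weight (q=0.5).
cutoff = weights.sum() * 0.5

# Smallest data value whose cumulative weight reaches the cutoff.
expected = cum_weight_by_value[cum_weight_by_value >= cutoff].index[0]
print(expected)  # 1.0
```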

One indeed gets this result when passing data and weights with a naive time index:

times_naive = pd.date_range('2017-6-23 18:00', periods=8, freq='15T')
data = pd.Series([1., 1, 1, 1, 1, 2, 2, 0], index=times_naive)
weights = pd.Series([160., 91, 65, 43, 24, 10, 1, 0], index=times_naive)

data.resample('D').apply(weighted_quantile, weights=weights, q=0.5)
Out[4]: 
2017-06-23    1.0
Freq: D, dtype: float64

But, as shown above, not when passing non-naive time series.
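
A possible reading of the 0.0 result: it is exactly what `weighted_quantile` returns when none of the weights line up with the index of the series it receives. In that case `reindex` yields only NaN, `fillna(0)` makes the cumulative sum all zeros, the cutoff becomes 0, and the smallest data value is selected. Whether this is what actually happens inside `resample(...).apply` for a timezone-aware index is an assumption and is not confirmed in this report. The sketch below reproduces the symptom without `resample` by deliberately shifting the weights' index by one day (`shifted_weights` is a hypothetical name, and `'15min'` is used as an equivalent alias for `'15T'`):

```python
import pandas as pd

def weighted_quantile(series, weights, q):
    series = series.sort_values()
    cumsum = weights.reindex(series.index).fillna(0).cumsum()
    cutoff = cumsum.iloc[-1] * q
    return series[cumsum >= cutoff].iloc[0]

times = pd.date_range('2017-6-23 18:00', periods=8, freq='15min', tz='UTC')
data = pd.Series([1., 1, 1, 1, 1, 2, 2, 0], index=times)
weights = pd.Series([160., 91, 65, 43, 24, 10, 1, 0], index=times)

# Hypothetical misalignment: same weights, but no index label matches the data.
shifted_weights = pd.Series(weights.values, index=times + pd.Timedelta(days=1))

aligned_result = weighted_quantile(data, weights, q=0.5)
misaligned_result = weighted_quantile(data, shifted_weights, q=0.5)

print(aligned_result)     # 1.0
print(misaligned_result)  # 0.0, the smallest value in data
```

If this is the mechanism, printing the index of the series that the function receives inside `apply` would show whether it still matches `weights.index`.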

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: 4.2.1
pip: 19.0.2
setuptools: 39.0.1
Cython: 0.29.4
numpy: 1.16.0
scipy: 1.2.0
pyarrow: None
xarray: 0.11.3
IPython: 7.2.0
sphinx: None
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.3.0
bs4: 4.6.1
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels

Apply (Apply, Aggregate, Transform, Map); Bug; Datetime (Datetime data dtype); Needs Tests (Unit test(s) needed to prevent regressions); Resample (resample method)
