Closed
Description
Code Sample, a copy-pastable example if possible
import pandas as pd
index = pd.DatetimeIndex([1450137600000000000, 1474059600000000000], tz='UTC').tz_convert('America/Chicago')
print(index)
df = pd.DataFrame([1, 2], index=index)
print(df.resample('12h', closed='right', label='right').last().ffill())
Problem description
Resampling does not handle a non-UTC index correctly when the index spans a daylight saving time change.
The problem appears to originate in the file https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/resample.py
function: _get_time_bins
code: binner = labels = DatetimeIndex(freq=self.freq,...
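To make the daylight saving time change concrete, the short check below inspects the UTC offset of each timestamp in the sample index. It is only an illustration of the input data and does not touch the resampling code. The variable name offsets is introduced here for the example.

```python
import pandas as pd

# Same two timestamps as in the code sample above.
index = pd.DatetimeIndex(
    [1450137600000000000, 1474059600000000000], tz="UTC"
).tz_convert("America/Chicago")

# The first timestamp falls in standard time (CST), the second in
# daylight saving time (CDT), so their UTC offsets differ by one hour.
offsets = [ts.utcoffset() for ts in index]
for ts, offset in zip(index, offsets):
    print(ts, offset)
```

Any bin edges generated between these two timestamps in local time therefore have to cross at least one DST transition.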
This problem can be solved by converting the tz of ax to UTC before resampling and applying the original tz after the DatetimeIndex is created,
so the code would look like this:
tz = ax.tz
ax = ax.tz_convert('UTC')

if len(ax) == 0:
    binner = labels = DatetimeIndex(
        data=[], freq=self.freq, name=ax.name)
    return binner, [], labels

first, last = ax.min(), ax.max()
first, last = _get_range_edges(first, last, self.freq,
                               closed=self.closed,
                               base=self.base)
# GH #12037
# use first/last directly instead of call replace() on them
# because replace() will swallow the nanosecond part
# thus last bin maybe slightly before the end if the end contains
# nanosecond part and lead to `Values falls after last bin` error
binner = labels = DatetimeIndex(freq=self.freq,
                                start=first,
                                end=last,
                                name=ax.name).tz_convert(tz)
This causes the bins to always be aligned to UTC times rather than to the original tz, but I think that is acceptable behaviour as well.
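The same idea can also be tried at user level, without patching pandas. The following is a minimal sketch, assuming a hypothetical helper named resample_via_utc (it is not part of pandas): it converts the frame to UTC, resamples there, and converts the result back to the original tz. It is a workaround under these assumptions, not the fix applied in the library.

```python
import pandas as pd

index = pd.DatetimeIndex(
    [1450137600000000000, 1474059600000000000], tz="UTC"
).tz_convert("America/Chicago")
df = pd.DataFrame([1, 2], index=index)


def resample_via_utc(frame, rule, **kwargs):
    """Hypothetical helper: build the bins in UTC, then relabel them
    in the frame's original time zone."""
    tz = frame.index.tz                      # remember the original tz
    utc_frame = frame.tz_convert("UTC")      # no DST transitions in UTC
    result = utc_frame.resample(rule, **kwargs).last().ffill()
    return result.tz_convert(tz)             # restore the original tz


out = resample_via_utc(df, "12h", closed="right", label="right")
print(out.head())
print(out.tail())
```

Because the bins are aligned to UTC, the local labels shift by one hour across the DST change: bin edges at 00:00 and 12:00 UTC appear as 18:00 and 06:00 in standard time and as 19:00 and 07:00 in daylight saving time.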
Expected Output
I expect the resampling to succeed regardless of the time range selected.
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 34.0.2
Cython: None
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: 0.7.9.None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None