Description
# Minimal reproduction (Python 2.7, pandas 0.19.2)
import pandas as pd
t = pd.Series([pd.Timestamp("2016-09-01", tz="US/Eastern") + k*pd.Timedelta("00:01:00") for k in xrange(5)])
t.iloc[2] = pd.NaT
print t
print
print t + pd.tseries.offsets.DateOffset(days=1)
# Observed output:
0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

0   2016-09-02 04:00:00-04:00
1   2016-09-02 04:01:00-04:00
2                         NaT
3   2016-09-02 04:03:00-04:00
4   2016-09-02 04:04:00-04:00
dtype: datetime64[ns, US/Eastern]
Problem description
Adding a DateOffset to a timezone-aware Series that contains NaT produces incorrect results for every element of the Series, not only around the NaT entry. In the example above, I added one day, but each timestamp jumps by one day and four hours. The same thing happens with other kinds of DateOffset objects (1 minute, 3 months, business day, etc.). The extra four hours equal the UTC offset of US/Eastern on that date (-04:00), so the incorrect jump appears to be related to the time zone.
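For anyone who wants to check their own install, here is a rough sketch that compares the Series-level addition against adding the offset to each timestamp one at a time. `shift_elementwise` is a hypothetical helper written for this comparison, not a pandas function, and the sketch assumes scalar `Timestamp + DateOffset` arithmetic is not affected, which I have not confirmed beyond this example.

```python
from __future__ import print_function  # lets the sketch run on Python 2.7 as well

import pandas as pd
from pandas.tseries.offsets import DateOffset


def shift_elementwise(series, offset):
    """Hypothetical helper: add `offset` to each timestamp individually, keeping NaT as NaT."""
    shifted = [ts + offset if pd.notnull(ts) else pd.NaT for ts in series]
    return pd.Series(shifted, index=series.index)


# Same data as the report: five tz-aware timestamps one minute apart, with one NaT.
start = pd.Timestamp("2016-09-01", tz="US/Eastern")
t = pd.Series([start + k * pd.Timedelta(minutes=1) for k in range(5)])
t.iloc[2] = pd.NaT

offset = DateOffset(days=1)
reference = shift_elementwise(t, offset)  # scalar arithmetic, one element at a time
vectorized = t + offset                   # the Series-level operation from the report

# Collect positions where the two results disagree (NaT positions are skipped).
mismatches = [
    (i, v, r)
    for i, (v, r) in enumerate(zip(vectorized, reference))
    if pd.notnull(r) and v != r
]
print(mismatches)
```

If the two approaches agree, `mismatches` is an empty list. Any entries it does contain show the position, the Series-level result, and the element-wise result side by side.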
Expected Output
0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

0   2016-09-02 00:00:00-04:00
1   2016-09-02 00:01:00-04:00
2                         NaT
3   2016-09-02 00:03:00-04:00
4   2016-09-02 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]
Output of pd.show_versions()
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: 0.2.1