Description
Code Sample, a copy-pastable example if possible
In [1]: import numpy as np, pandas as pd
In [2]: dt = pd.DatetimeIndex([pd.Timestamp("2017/01/01")], tz='US/Eastern') # dtype=datetime64[ns, US/Eastern]
In [3]: td_np = np.array([np.timedelta64(1, 'ns')]) # dtype="timedelta64[ns]"
In [4]: (dt + td_np).values # Bad, "applies" TZ twice to get 10:00 instead of 05:00!
Out[4]: array(['2017-01-01T10:00:00.000000001'], dtype='datetime64[ns]')
In [5]: (dt + td_np[0]).values
Out[5]: array(['2017-01-01T05:00:00.000000001'], dtype='datetime64[ns]')
In [6]: (dt + pd.Series(td_np)).values
Out[6]: array(['2017-01-01T05:00:00.000000001'], dtype='datetime64[ns]')
In [7]: (dt.tz_convert(None) + td_np).values
Out[7]: array(['2017-01-01T05:00:00.000000001'], dtype='datetime64[ns]')
In [8]: (dt.values + td_np)
Out[8]: array(['2017-01-01T05:00:00.000000001'], dtype='datetime64[ns]')
In [9]: (dt + pd.TimedeltaIndex(td_np))
TypeError: data type not understood
Problem description
(The above was run on 0.20.3.) When adding a tz-aware DatetimeIndex and a numpy array of timedeltas, the result incorrectly has the timezone offset applied twice: In [4] yields 10:00 instead of 05:00 for the underlying UTC value. There are several related issues on tz-aware DatetimeIndex and timedelta sums on the issue tracker, for example #14022. However, as far as I've read, those issues appear to have been mostly resolved as of ~0.19; this case just slipped through the cracks. I haven't found any other variants of this bug, except maybe that tz-aware DatetimeIndex + TimedeltaIndex raises an exception (In [9]).
(Just for context: I am not relying on this behavior. I happened to come across it while updating some tests to use pandas 0.20.3, but it could lead to someone silently getting incorrect results.)
I think all that needs to be fixed here is to add another case in pd.DatetimeIndex[OpsMixin].__add__ to handle any array-like with dtype=timedelta64[..]. (Maybe there's a more general upstream solution?) If someone confirms, I can submit a patch.
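Until such a case exists, one possible user-side workaround is to do the addition on the underlying UTC values (the form that already works in In [8]) and restore the timezone afterwards. This is only a sketch: `add_timedelta_array` is a hypothetical helper name, not part of pandas, and it is not the proposed patch to `__add__`.

```python
import numpy as np
import pandas as pd


def add_timedelta_array(dti, td):
    """Hypothetical helper: add a timedelta64 array to a DatetimeIndex.

    The sum is computed on dti.values (UTC instants for a tz-aware index),
    then the original timezone is re-attached.
    """
    td = np.asarray(td, dtype="timedelta64[ns]")
    # Plain numpy arithmetic, no timezone logic involved at this step.
    summed = pd.DatetimeIndex(dti.values + td)
    if dti.tz is None:
        return summed
    # The naive values are UTC instants, so label them as UTC and convert back.
    return summed.tz_localize("UTC").tz_convert(dti.tz)


dt = pd.DatetimeIndex([pd.Timestamp("2017/01/01")], tz="US/Eastern")
td_np = np.array([np.timedelta64(1, "ns")])
result = add_timedelta_array(dt, td_np)
```

Here `result` keeps the US/Eastern timezone of `dt`, and its single element should be one nanosecond past midnight Eastern, i.e. 05:00:00.000000001 UTC.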
Expected Output
In [4]: (dt + td_np).values
Out[4]: array(['2017-01-01T05:00:00.000000001'], dtype='datetime64[ns]')
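The expectation can also be written as a small consistency check that could serve as a starting point for a regression test. This is a sketch under the assumption that all three forms of the right-hand operand should produce the same instant. On 0.20.3 it would not run to completion, because the TimedeltaIndex case raises the TypeError shown in In [9].

```python
import numpy as np
import pandas as pd

dt = pd.DatetimeIndex([pd.Timestamp("2017/01/01")], tz="US/Eastern")
td_np = np.array([np.timedelta64(1, "ns")])

# Midnight US/Eastern on this date is 05:00 UTC, so every sum should be
# one nanosecond after that instant.
expected = pd.Timestamp("2017-01-01 05:00:00.000000001", tz="UTC")

sums = {
    "ndarray": dt + td_np,                            # the case reported as broken
    "scalar": dt + td_np[0],                          # already correct in In [5]
    "TimedeltaIndex": dt + pd.TimedeltaIndex(td_np),  # raises on 0.20.3
}

# Collect every variant whose first element is not the expected instant.
mismatches = {name: idx[0] for name, idx in sums.items() if idx[0] != expected}
```

An empty `mismatches` dict means all variants agree with the expected output above.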
Output of pd.show_versions()
pandas: 0.20.3
pytest: 2.9.1
pip: 7.1.0
setuptools: 19.4
Cython: 0.24
numpy: 1.13.1
scipy: 0.17.0
xarray: None
IPython: 5.1.0
sphinx: 1.4.5
patsy: 0.4.1
dateutil: 2.2
pytz: 2015.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 1.5.3
openpyxl: 1.8.6
xlrd: 0.9.3
xlwt: None
xlsxwriter: None
lxml: 3.6.0
bs4: 4.4.0
html5lib: None
sqlalchemy: 1.0.12
pymysql: 0.6.3.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None