Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import pytz
from dateutil.tz import gettz
# CET = pytz.timezone("Europe/Paris")
CET = gettz("Europe/Paris")
start = pd.Timestamp("2021-10-31 01:45", tz=CET)
idx = pd.date_range(start, pd.Timestamp("2021-10-31 03:45", tz=CET), freq="30T")
pd.Series(idx, idx)
# 2021-10-31 01:45:00+02:00 2021-10-31 01:45:00+02:00
# 2021-10-31 02:15:00+02:00 2021-10-31 02:15:00+02:00
# 2021-10-31 02:45:00+02:00 2021-10-31 02:45:00+02:00
# 2021-10-31 02:15:00+01:00 2021-10-31 02:15:00+02:00 --> offset of index changes here
# 2021-10-31 02:45:00+01:00 2021-10-31 02:45:00+02:00
# 2021-10-31 03:15:00+01:00 2021-10-31 03:15:00+01:00 --> offset of values changes here
# 2021-10-31 03:45:00+01:00 2021-10-31 03:45:00+01:00
# Freq: 30T, dtype: datetime64[ns, tzfile('/usr/share/zoneinfo/Europe/Paris')]
Issue Description
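I noticed unexpected duplicated values when building timestamp ranges across the autumn DST transition with the Europe/Paris (CET) timezone from dateutil. In the output above, the index switches from +02:00 to +01:00 at the right moment, but the values repeat 02:15:00+02:00 and 02:45:00+02:00 and only switch offset two rows later. The issue does not occur with the same timezone from pytz.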
Expected Behavior
No duplicates in the values of the series above: each value should equal its index label, including the UTC offset.
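For context, the repeated wall-clock times in the index (02:15 and 02:45 appearing twice) are legitimate, because those local times occur twice on 2021-10-31 in Europe/Paris. The sketch below illustrates this with plain Python only. It uses the standard-library `zoneinfo` module rather than dateutil or pandas, which is an assumption made to keep the example dependency-free, and it does not reproduce the bug itself. It only shows that the two occurrences are distinct instants, distinguished by the `fold` attribute.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # stdlib stand-in for dateutil's gettz (assumption)

PARIS = ZoneInfo("Europe/Paris")

# 02:15 local time occurs twice on 2021-10-31; `fold` selects which occurrence.
first = datetime(2021, 10, 31, 2, 15, tzinfo=PARIS)           # fold=0: still summer time
second = datetime(2021, 10, 31, 2, 15, fold=1, tzinfo=PARIS)  # fold=1: after the clock change

first_offset = first.utcoffset()    # +02:00
second_offset = second.utcoffset()  # +01:00

# Convert to UTC before subtracting: subtracting two datetimes that share the
# same tzinfo object compares wall-clock times only and would hide the gap.
gap = second.astimezone(timezone.utc) - first.astimezone(timezone.utc)

print(first.isoformat(), second.isoformat(), gap)
```

The two datetimes share the same wall time but are one hour apart in UTC, so an index row and a value row that differ only in offset do not refer to the same instant.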
Installed Versions
INSTALLED VERSIONS
commit : bb1f651
python : 3.10.2.final.0
python-bits : 64
OS : Darwin
OS-release : 21.3.0
Version : Darwin Kernel Version 21.3.0: Wed Jan 5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.4.0
numpy : 1.21.0
pytz : 2020.1
dateutil : 2.8.2
pip : 22.0.4
setuptools : 58.1.0
Cython : None
pytest : None
hypothesis : None
...
xarray : None
xlrd : 2.0.1
xlwt : None
zstandard : None