Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
america_chicago = "dateutil//usr/share/zoneinfo/America/Chicago"
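# 01:00 occurs twice on 2013-11-03 in America/Chicago (DST ends that night).
# fold=0 (the default) is the first occurrence, fold=1 the second.
transition_1 = pd.Timestamp(year=2013, month=11, day=3, hour=1, minute=0, tz=america_chicago)
transition_2 = pd.Timestamp(year=2013, month=11, day=3, hour=1, minute=0, fold=1, tz=america_chicago)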
print(transition_1, transition_2)
print(hash(transition_1))
print(hash(transition_2))
$ python3 -q -X faulthandler hash_bug.py
2013-11-03 01:00:00-05:00 2013-11-03 01:00:00-06:00
780959649129526403
Fatal Python error: Segmentation fault
Current thread 0x00007f908caf1740 (most recent call first):
File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 1814 in _datetime_to_timestamp
File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 717 in _find_last_transition
File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 809 in _resolve_ambiguous_time
File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 739 in _find_ttinfo
File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 828 in utcoffset
File "t.py", line 8 in <module>
[1] 2761996 segmentation fault (core dumped) python3 -q -X faulthandler t.py
Problem description
hash() should return a correct value for both Timestamps, and it should never segfault.
This causes problems whenever such a Timestamp is used as a dictionary key or stored in a set, since both call hash() on it.
Expected Output
I would have expected the same behavior as datetime in the Python standard library:
import datetime as dt
from dateutil.tz import gettz
america_chicago = gettz("America/Chicago")
transition_1 = dt.datetime(year=2013, month=11, day=3, hour=1, minute=0, tzinfo=america_chicago)
transition_2 = dt.datetime(year=2013, month=11, day=3, hour=1, minute=0, fold=1, tzinfo=america_chicago)
print(transition_1, transition_2)
print(hash(transition_1))
print(hash(transition_2))
2013-11-03 01:00:00-05:00 2013-11-03 01:00:00-06:00
780959649129526403
780959649129526403
So it seems that this bug is coming from pandas.
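For context, the identical hashes above are what PEP 495 specifies for the standard library: fold is ignored when hashing, and also when comparing two datetimes that share the same tzinfo. The sketch below illustrates that rule using only the standard library. ChicagoFallBack2013 is a hypothetical toy tzinfo written for this example; it models nothing except the repeated hour from the report and is not how dateutil or pandas resolve offsets.

```python
import datetime as dt


class ChicagoFallBack2013(dt.tzinfo):
    """Hypothetical toy tzinfo (an assumption for illustration only).

    It models just the ambiguous hour 01:00-02:00 on 2013-11-03 in
    America/Chicago and picks the UTC offset from the fold attribute.
    """

    def utcoffset(self, d):
        # fold=0: first occurrence (CDT, UTC-5); fold=1: second (CST, UTC-6)
        return dt.timedelta(hours=-6 if d.fold else -5)

    def dst(self, d):
        return dt.timedelta(0) if d.fold else dt.timedelta(hours=1)

    def tzname(self, d):
        return "CST" if d.fold else "CDT"


tz = ChicagoFallBack2013()
first = dt.datetime(2013, 11, 3, 1, 0, tzinfo=tz)
second = dt.datetime(2013, 11, 3, 1, 0, fold=1, tzinfo=tz)

# The two values carry different UTC offsets...
offsets_differ = first.utcoffset() != second.utcoffset()

# ...but datetime hashes with fold treated as 0, so the hashes match.
same_hash = hash(first) == hash(second)

# With a shared tzinfo, comparison also ignores fold.
equal = first == second

# Consequently a set keeps only one of the two.
seen = {first, second}

print(offsets_differ, same_hash, equal, len(seen))
```

Under these assumptions, a fix on the pandas side would only need hash(transition_2) to return without crashing and to agree with hash(transition_1), matching what datetime does.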
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : 1c88e6aff94cc9183909b7c110f554df42509073
python : 3.8.2.final.0
python-bits : 64
OS : Linux
OS-release : 5.5.13-arch2-1
Version : #1 SMP PREEMPT Mon, 30 Mar 2020 20:42:41 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.0.dev0+1446.g1c88e6aff
numpy : 1.18.2
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3
Cython : 0.29.16
pytest : 5.4.1
hypothesis : None
sphinx : 3.0.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.16
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None