Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
from datetime import datetime
from pandas import DataFrame
# typo in the data entry, should have been 2020!
df = DataFrame({'x': [datetime(2920, 10, 1)]})
# let's try to fix that in code:
df.x.replace({datetime(2920, 10, 1): datetime(2020, 10, 1)})
# or
df.replace({datetime(2920, 10, 1): datetime(2020, 10, 1)})
Raises:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2920-10-01 00:00:00
Full traceback details
OutOfBoundsDatetime Traceback (most recent call last)
<ipython-input-1-a4bdb67c8945> in <module>
5 df = DataFrame({'x': [datetime(2920, 10, 1)]})
6 # let's try to fix that in code:
----> 7 df.x.replace({datetime(2920, 10, 1): datetime(2020, 10, 1)})
/site-packages/pandas/core/series.py in replace(self, to_replace, value, inplace, limit, regex, method)
4561 method="pad",
4562 ):
-> 4563 return super().replace(
4564 to_replace=to_replace,
4565 value=value,
/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
6495 to_replace, value = keys, values
6496
-> 6497 return self.replace(
6498 to_replace, value, inplace=inplace, limit=limit, regex=regex
6499 )
/site-packages/pandas/core/series.py in replace(self, to_replace, value, inplace, limit, regex, method)
4561 method="pad",
4562 ):
-> 4563 return super().replace(
4564 to_replace=to_replace,
4565 value=value,
/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
6538 )
6539 self._consolidate_inplace()
-> 6540 new_data = self._mgr.replace_list(
6541 src_list=to_replace,
6542 dest_list=value,
/site-packages/pandas/core/internals/managers.py in replace_list(self, src_list, dest_list, inplace, regex)
640 mask = ~isna(values)
641
--> 642 masks = [comp(s, mask, regex) for s in src_list]
643
644 result_blocks = []
/site-packages/pandas/core/internals/managers.py in <listcomp>(.0)
640 mask = ~isna(values)
641
--> 642 masks = [comp(s, mask, regex) for s in src_list]
643
644 result_blocks = []
/site-packages/pandas/core/internals/managers.py in comp(s, mask, regex)
633 return ~mask
634
--> 635 s = com.maybe_box_datetimelike(s)
636 return _compare_or_regex_search(values, s, regex, mask)
637
/site-packages/pandas/core/common.py in maybe_box_datetimelike(value, dtype)
88
89 if isinstance(value, (np.datetime64, datetime)):
---> 90 value = tslibs.Timestamp(value)
91 elif isinstance(value, (np.timedelta64, timedelta)):
92 value = tslibs.Timedelta(value)
pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject()
pandas/_libs/tslibs/np_datetime.pyx in pandas._libs.tslibs.np_datetime.check_dts_bounds()
Problem description
It is useful to be able to replace out-of-bounds dates, because pandas cannot represent them as nanosecond timestamps. This is currently difficult because replace no longer works for them: it raises OutOfBoundsDatetime as soon as the out-of-bounds value appears as a key in to_replace. This is a regression: the code above worked in pandas 1.0.4, but fails in pandas 1.1.2 and on master.
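For context on why 2920-10-01 is out of bounds, here is a minimal standard-library sketch. It assumes the timestamp is stored as a signed 64-bit count of nanoseconds since the Unix epoch, with the smallest int64 value reserved as a missing-value marker. The helper names are hypothetical and not part of pandas.

```python
from datetime import datetime

# Assumption: timestamps are signed 64-bit nanosecond counts since the Unix
# epoch, with the minimum int64 value reserved as a missing-value marker.
EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def nanoseconds_since_epoch(dt: datetime) -> int:
    """Exact integer nanoseconds between the epoch and a naive datetime."""
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 10**9 + delta.microseconds * 1_000


def fits_in_ns_timestamp(dt: datetime) -> bool:
    """Hypothetical helper: True if dt fits in an int64 nanosecond count."""
    return INT64_MIN < nanoseconds_since_epoch(dt) <= INT64_MAX


print(fits_in_ns_timestamp(datetime(2920, 10, 1)))  # False
print(fits_in_ns_timestamp(datetime(2020, 10, 1)))  # True
```

Under that assumption the representable range ends in April 2262, so the mistyped year 2920 cannot be converted, while the intended year 2020 can.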
Expected Output
The result of either replace call should be equal to DataFrame({'x': [datetime(2020, 10, 1)]}) (as a Series for df.x.replace).
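Until replace handles this again, one possible workaround is to compare the values element by element with Series.map instead of going through replace. This is only a sketch under the assumption that map hands each stored value to the callable; the names bad and fixed are illustrative, and it has not been checked against every pandas version.

```python
from datetime import datetime

import pandas as pd

bad = datetime(2920, 10, 1)    # the mistyped value
fixed = datetime(2020, 10, 1)  # the intended value

df = pd.DataFrame({"x": [bad]})

# Compare each element in a plain Python callable rather than passing the
# out-of-bounds datetime to replace as a key.
df["x"] = df["x"].map(lambda v: fixed if v == bad else v)

print(df)
```

The trade-off is that map runs a Python function per element, so it is slower than a vectorised replace on large columns.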
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2a7d332
python : 3.8.1.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-48-generic
Version : #52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8
pandas : 1.1.2
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.2.3
setuptools : 41.2.0
Cython : None
pytest : 5.3.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.1.2
numexpr : None
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
numba : 0.49.0