Description
Pandas version checks
-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
>>> import pandas as pd
>>> pd.__version__
'2.1.4'
>>> df = pd.DataFrame({"time_col": ['11:41:43.076160']})
>>> df.to_json("/tmp/test.json")
>>> open("/tmp/test.json").read()
'{"time_col":{"0":"11:41:43.076160"}}'
>>> pd.read_json("/tmp/test.json")
time_col
0 11:41:43.076160
>>> pd.read_json("/tmp/test.json", dtype={"time_col": "time64[us][pyarrow]"})
...
pyarrow.lib.ArrowNotImplementedError: Unsupported cast from string to time64 using function cast_time64
Issue Description
Below stacktrace originates from bigframes test suite running with pandas 2.1.4. It passes with pandas 2.1.3.
> df = session.read_json(
read_path,
# Convert default pandas dtypes to match BigQuery DataFrames dtypes.
dtype=dtype,
lines=True,
orient="records",
)
tests/system/small/test_session.py:984:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
bigframes/session/__init__.py:1044: in read_json
pandas_df = pandas.read_json( # type: ignore
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:808: in read_json
return json_reader.read()
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1016: in read
obj = self._get_object_parser(self._combine_lines(data_lines))
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1044: in _get_object_parser
obj = FrameParser(json, **kwargs).parse()
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1183: in parse
self._try_convert_types()
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1445: in _try_convert_types
self._process_converter(
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1427: in _process_converter
new_data, result = f(col, c)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1446: in <lambda>
lambda col, c: self._try_convert_data(col, c, convert_dates=False)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/io/json/_json.py:1243: in _try_convert_data
return data.astype(dtype), True
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/generic.py:6668: in astype
new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/internals/managers.py:431: in astype
return self.apply(
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/internals/managers.py:364: in apply
applied = getattr(b, f)(**kwargs)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/internals/blocks.py:754: in astype
new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/dtypes/astype.py:237: in astype_array_safe
new_values = astype_array(values, dtype, copy=copy)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/dtypes/astype.py:182: in astype_array
values = _astype_nansafe(values, dtype, copy=copy)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/dtypes/astype.py:80: in _astype_nansafe
return dtype.construct_array_type()._from_sequence(arr, dtype=dtype, copy=copy)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/arrays/arrow/array.py:277: in _from_sequence
pa_array = cls._box_pa_array(scalars, pa_type=pa_type, copy=copy)
.nox/system_prerelease/lib/python3.11/site-packages/pandas/core/arrays/arrow/array.py:491: in _box_pa_array
pa_array = pa_array.cast(pa_type)
pyarrow/array.pxi:997: in pyarrow.lib.Array.cast
???
.nox/system_prerelease/lib/python3.11/site-packages/pyarrow/compute.py:404: in cast
return call_function("cast", [arr], options, memory_pool)
pyarrow/_compute.pyx:590: in pyarrow._compute.call_function
???
pyarrow/_compute.pyx:385: in pyarrow._compute.Function.call
???
pyarrow/error.pxi:154: in pyarrow.lib.pyarrow_internal_check_status
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E pyarrow.lib.ArrowNotImplementedError: Unsupported cast from string to time64 using function cast_time64
pyarrow/error.pxi:91: ArrowNotImplementedError
Expected Behavior
Test should continue to pass while upgrading pandas from 2.1.3 to 2.1.4.
Installed Versions
$ python -c "import pandas as pd; pd.show_versions()"
INSTALLED VERSIONS
commit : a671b5a
python : 3.11.4.final.0
python-bits : 64
OS : Linux
OS-release : 6.5.6-1rodete4-amd64
Version : #1 SMP PREEMPT_DYNAMIC Debian 6.5.6-1rodete4 (2023-10-24)
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.1.4
numpy : 1.26.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : None
pytest : 7.4.3
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.18.1
pandas_datareader : None
bs4 : None
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : 2023.12.1
gcsfs : 2023.12.1
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : 0.19.2
pyarrow : 15.0.0.dev247
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : 2.0.23
tables : None
tabulate : 0.9.0
xarray : 2023.12.0
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None