Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue 1
```python
import pandas as pd
import pyarrow as pa

array = pa.array([1.5, 2.5], type=pa.float64())
array.to_pandas(types_mapper={pa.float64(): pd.ArrowDtype(pa.int64())}.get)
```

```
ArrowInvalid: Float value 1.5 was truncated converting to int64
```
Issue 2
```python
import pandas as pd
import pyarrow as pa
from decimal import Decimal

df = pd.DataFrame({"a": [Decimal("123.00")]}, dtype="string[pyarrow]")
df.to_parquet("decimal.pq", schema=pa.schema([("a", pa.decimal128(5))]))

result = pd.read_parquet("decimal.pq")
expected = pd.DataFrame({"a": ["123"]}, dtype="string[python]")
pd.testing.assert_frame_equal(result, expected)
```

```
AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="a") are different

Attribute "dtype" are different
[left]:  object
[right]: string[python]
```
Issue Description
Two issues have been observed when using pandas 2.2.3 with pyarrow >= 18.0.0:

- Failing test cases: `pandas/tests/extension/test_arrow.py::test_from_arrow_respecting_given_dtype_unsafe` and `pandas/tests/io/test_parquet.py::TestParquetPyArrow::test_roundtrip_decimal`.
- Stricter float-to-int casting causes `ArrowInvalid` in tests such as `test_from_arrow_respecting_given_dtype_unsafe` (a standalone illustration of the cast behavior follows this list).
- Decimal roundtrip mismatch: `test_roundtrip_decimal` fails due to a dtype mismatch (`object` vs. `string[python]`) when reading back a decimal column written with an explicit pyarrow schema.

These issues were not present with pyarrow 17.x.
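
For context, the stricter behavior lines up with pyarrow's distinction between safe and unsafe casts. The snippet below is only a standalone illustration of that cast API on a plain `pyarrow.Array`; it is not the pandas code path exercised by the failing test:

```python
import pyarrow as pa

arr = pa.array([1.5, 2.5], type=pa.float64())

# The default (safe) cast refuses to drop the fractional part and raises.
try:
    arr.cast(pa.int64())
except pa.ArrowInvalid as exc:
    print(exc)  # Float value 1.5 was truncated converting to int64

# An explicitly unsafe cast truncates instead of raising (values become 1 and 2).
print(arr.cast(pa.int64(), safe=False))
```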
Expected Behavior
- Float-to-int casting should either handle truncation more gracefully (as in older pyarrow versions), or the affected tests should be updated to skip or adjust for the stricter behavior.
- Decimal roundtrips through Parquet should preserve the pandas dtype, or the expected type coercion should be clearly documented (a possible user-level workaround is sketched below).
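
Until the upstream behavior is settled, one possible user-level workaround for the decimal case is to normalize the round-tripped column explicitly. This is only a sketch and assumes the `object` column that newer pyarrow hands back stringifies to the expected values (e.g. `Decimal("123")` becomes `"123"`):

```python
import pandas as pd

result = pd.read_parquet("decimal.pq")
# Coerce the object-dtype column back to the expected string dtype before
# comparing; non-string objects are converted via str() during the astype.
result["a"] = result["a"].astype("string[python]")

expected = pd.DataFrame({"a": ["123"]}, dtype="string[python]")
pd.testing.assert_frame_equal(result, expected)
```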
Installed Versions
```
python  : 3.11.11
pandas  : 2.2.3
pyarrow : 19.0.1
```