
BUG: read_csv() silently ignores out-of-range integers #55232

Open
@davidchall

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

from io import StringIO
import pandas as pd

# raises exception: cannot safely cast non-equivalent int32 to uint8
pd.Series([-1, 257], dtype="UInt8")

# no exception raised
data = StringIO("x\n-1\n257")
df = pd.read_csv(data, dtype={"x": "UInt8"})

# unexpected wraparound behavior: -1 -> 255, 257 -> 1
df.x

Issue Description

The read_csv() function no longer raises an exception when a value falls outside the range of the requested integer dtype. Instead, the value silently wraps around: with dtype "UInt8", -1 is read as 255 and 257 is read as 1.
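The reported values are consistent with arithmetic modulo 2**8, the number of values an unsigned 8-bit integer can represent. The sketch below reproduces only that arithmetic in plain Python. The helper name wrap_uint8 is hypothetical, and the code makes no claim about how the pandas parser performs the conversion internally.

```python
UINT8_MODULUS = 2 ** 8  # an unsigned 8-bit integer holds 256 distinct values


def wrap_uint8(value: int) -> int:
    """Hypothetical helper: reduce an integer modulo 256, as a wrapping cast would."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so negative inputs map into the range 0..255 as well.
    return value % UINT8_MODULUS


for original in (-1, 257):
    print(original, "->", wrap_uint8(original))
```

Both values from the reproducible example land on the results shown in the report (255 and 1), which is why the output looks like plausible data rather than an obvious failure.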

Expected Behavior

On pandas 1.5.3, pd.read_csv() raises a "cannot cast" exception, which is similar to how this scenario is handled by the pd.Series() constructor. I expect pandas 2.1.1 to continue this behavior.
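Until the parser raises again, one possible workaround is to read the column with a wider dtype, check the range explicitly, and only then cast to the narrow dtype. The following is a minimal sketch under that assumption. The helper read_csv_checked and its parameters are hypothetical and not part of pandas, and it assumes every value fits in Int64.

```python
from io import StringIO

import numpy as np
import pandas as pd


def read_csv_checked(buffer, column, dtype_name):
    """Hypothetical helper: read `column` as Int64, range-check it, then cast."""
    # Read with a wide nullable integer dtype so no value is narrowed yet.
    df = pd.read_csv(buffer, dtype={column: "Int64"})

    # Look up the bounds of the target dtype, e.g. "UInt8" -> numpy uint8.
    info = np.iinfo(np.dtype(dtype_name.lower()))

    # Ignore missing values; they carry no range information.
    values = df[column].dropna()
    bad = values[(values < info.min) | (values > info.max)]
    if not bad.empty:
        raise ValueError(
            f"column {column!r} has values outside {dtype_name} range "
            f"[{info.min}, {info.max}]: {bad.tolist()}"
        )

    # Every value is in range, so this cast cannot wrap around.
    df[column] = df[column].astype(dtype_name)
    return df


try:
    read_csv_checked(StringIO("x\n-1\n257"), "x", "UInt8")
except ValueError as exc:
    print("rejected:", exc)
```

The explicit check costs one extra pass over the column, but it restores the fail-loudly behavior described for pandas 1.5.3 without depending on how a given pandas version handles the narrowing cast.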

Installed Versions

python : 3.11.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22621

pandas : 2.1.1
numpy : 1.26.0

    Labels

    Bug; IO CSV (read_csv, to_csv); Regression (functionality that used to work in a prior pandas version)
