pd.to_numeric(..., errors="coerce") failing silently when strings contain "uint64" #32394

@buhrmann

Problem description

When coercing strings to numeric values with to_numeric(..., errors="coerce"), any string containing the substring "uint64" is silently returned unchanged instead of being coerced to NaN. Other dtype-like substrings such as "uint32" or "float64" do not seem to trigger this.

import pandas as pd

strs = ["32", "64", "uint32", "float64", "sdnfonsdf uint32 knsdf", "sdnfonsdf uint64 knsdf", "uint64"]
print([pd.to_numeric(s, errors="coerce") for s in strs])
pd.to_numeric(pd.Series(["32", "64", "uint64"]), errors="coerce")

Actual Output

[32, 64, nan, nan, nan, 'sdnfonsdf uint64 knsdf', 'uint64']

0        32
1        64
2    uint64
dtype: object

Expected Output

[32, 64, nan, nan, nan, nan, nan]

0        32.0
1        64.0
2        NaN
dtype: float64

The failure appears to be the same in 0.25.3 and 1.0...
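
As a possible stopgap, the expected result can be approximated by parsing each element with Python's built-in float() and falling back to NaN. This is only a sketch, not how pandas implements coercion: the helper names safe_float and coerce_strings are hypothetical, and float() accepts some inputs (for example "1_000" or "infinity") that to_numeric may treat differently.

```python
import math


def safe_float(value):
    """Hypothetical helper: parse one value, returning NaN if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Anything float() rejects (including "uint64") becomes NaN.
        return math.nan


def coerce_strings(values):
    """Hypothetical helper: element-wise coercion of an iterable to floats."""
    return [safe_float(v) for v in values]


strs = ["32", "64", "uint32", "float64",
        "sdnfonsdf uint32 knsdf", "sdnfonsdf uint64 knsdf", "uint64"]
print(coerce_strings(strs))
```

Passing the returned list to pd.Series(..., dtype="float64") should give a float64 Series like the one under Expected Output, at the cost of a slower pure-Python loop.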

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None

pandas : 0.25.3
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200119
Cython : None
pytest : 5.3.4
hypothesis : None
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : None
pymysql : 0.9.3
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.12
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None

Labels

Numeric Operations (Arithmetic, Comparison, and Logical operations)
