to_numeric not correctly converting decimal type to float #21551

Closed
@MikeWoodward

Code Sample, a copy-pastable example if possible

import pandas as pd
import decimal

# Set up data
data = {'line': [1,2,3,4],
        'decimal': [12345678.1, 12345678.01, 12345678.001, 12345678.0001],
        'original': [12345678.1, 12345678.01, 12345678.001, 12345678.0001]}
df = pd.DataFrame(data)

df.to_csv('df.csv', index=False)

# Re-read the CSV, parsing the 'decimal' column as decimal.Decimal objects
df = pd.read_csv('df.csv', converters={'decimal': decimal.Decimal})

# Convert the Decimal column to float in two different ways
df['float_n'] = pd.to_numeric(df['decimal'], downcast='float', errors='coerce')
df['float_a'] = df['decimal'].astype(float)

for index, row in df.iterrows():
    print(row['float_n'], row['float_a'])

Problem description

I'm using to_numeric to convert the results of a database query from a decimal type to float. In some cases to_numeric rounds the values, which gives an incorrect result, whereas astype(float) converts the same data correctly.

I can't share my original code here because it uses a database call and belongs to a work project, but I have created sample code that shows the same behavior.

Here's the output from my example:

12345678.0 12345678.1
12345678.0 12345678.01
12345678.0 12345678.001
12345678.0 12345678.0001

The first column is the result of the to_numeric conversion; the second column is the result of the astype(float) conversion. These columns should contain the same values, but they don't: to_numeric is rounding the results and astype isn't.
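
One possible reading of this output, offered as an assumption rather than a confirmed diagnosis: the pandas documentation describes downcast='float' as selecting the smallest float dtype, with np.float32 as the minimum, and a 32-bit float cannot hold the fractional part of values this large. The sketch below uses only the standard library's struct module to round-trip the sample values through single precision. The helper name to_float32 is hypothetical and is not part of pandas.

```python
import struct


def to_float32(x):
    """Round-trip a Python float (64-bit) through IEEE 754 single precision."""
    return struct.unpack('f', struct.pack('f', x))[0]


values = [12345678.1, 12345678.01, 12345678.001, 12345678.0001]
rounded = [to_float32(v) for v in values]

# Left: value after single-precision rounding; right: original 64-bit value
for original, single in zip(values, rounded):
    print(single, original)
```

Between 2**23 and 2**24, adjacent single-precision values are 1.0 apart, so each sample value lands on 12345678.0, which matches the first column above.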

Expected Output

12345678.1 12345678.1
12345678.01 12345678.01
12345678.001 12345678.001
12345678.0001 12345678.0001

In other words, to_numeric and astype should give the same result - to_numeric should not round the data.
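
If a smaller dtype is wanted only when it loses nothing, one option is to compare the downcast values against the float64 result before accepting them. This is a minimal sketch under that assumption, not pandas behavior: to_float_checked and its tol parameter are hypothetical names, and the sketch relies on astype(float) handling Decimal objects as shown in the sample code.

```python
import decimal

import pandas as pd


def to_float_checked(series, tol=0.0):
    """Hypothetical helper: return float32 only if no value changes by more than tol."""
    full = series.astype(float)        # float64 reference, as in the report
    small = full.astype('float32')     # candidate downcast
    # NaN compares as False here, so a column with missing values stays float64
    if ((small.astype('float64') - full).abs() <= tol).all():
        return small
    return full


s = pd.Series([decimal.Decimal('12345678.1'),
               decimal.Decimal('12345678.01')], dtype=object)
result = to_float_checked(s)
print(result.dtype)
```

With the sample values the helper keeps float64, because the single-precision candidates differ from the originals. A column such as [1.5, 2.25], which float32 represents exactly, would be downcast.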

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: 3.3.2
pip: 10.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels

Docs, Dtype Conversions, Numeric Operations
