BUG: convert_dtypes() doesn't convert after a previous conversion was done #58543

Closed
@caballerofelipe

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

Step 1

import pandas as pd

df = pd.DataFrame({'column': [0.0, 1.0, 2.0, 3.3]})
df

Returns

  column
0 0.0
1 1.0
2 2.0
3 3.3

Step 2

df.dtypes

Returns

column    float64
dtype: object

Step 3

df = df.convert_dtypes()

Step 4

df.dtypes

Returns

column    Float64
dtype: object

Step 5

# Drop the last row (3.3) so that only values without a decimal part remain
newdf = df.iloc[:-1]
newdf

Returns

  column
0 0.0
1 1.0
2 2.0

Step 6

newdf.convert_dtypes()

Returns

  column
0 0.0
1 1.0
2 2.0

Step 7

newdf.convert_dtypes().dtypes

Returns

column    Float64
dtype: object

Issue Description

When a DataFrame column holds decimal numbers, convert_dtypes correctly converts its dtype from float64 to Float64 (capital F).

However, after removing the values that have a decimal part and running convert_dtypes again, I would intuitively expect the function to convert the column to Int64 (capital I) instead of keeping Float64.
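For comparison, here is a small sketch that sets the two paths side by side. It assumes that a freshly built NumPy float64 column of whole numbers is converted to Int64, as the convert_dtypes documentation describes. The variable names are only illustrative, and the second result may depend on the pandas version in use.

```python
import pandas as pd

# Path A: whole-number floats still stored as NumPy float64.
fresh = pd.DataFrame({"column": [0.0, 1.0, 2.0]})
fresh_dtype = fresh.convert_dtypes()["column"].dtype

# Path B: the path from this report. The column is converted once
# (with 3.3 present), sliced, and converted again.
converted = pd.DataFrame({"column": [0.0, 1.0, 2.0, 3.3]}).convert_dtypes()
sliced_dtype = converted.iloc[:-1].convert_dtypes()["column"].dtype

print("fresh float64 column  ->", fresh_dtype)
# Reported as Float64 on pandas 2.2.2 in this issue; other versions may differ.
print("sliced Float64 column ->", sliced_dtype)
```

Both columns hold the same values (0.0, 1.0, 2.0), so any difference between the two printed dtypes comes only from the dtype the column had before convert_dtypes was called.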

Expected Behavior

convert_dtypes should convert a column from Float64 to Int64 when none of its values has a decimal part.
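Until convert_dtypes behaves this way, a possible workaround is to do the check by hand. The helper below is a minimal sketch, not part of pandas: the name whole_floats_to_int64 is hypothetical, and it assumes unique column names. It casts a float column to Int64 only when every non-missing value is finite, whole, and within the int64 range.

```python
import numpy as np
import pandas as pd


def whole_floats_to_int64(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with whole-number float columns cast to Int64.

    Hypothetical helper for illustration; assumes unique column names.
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        # Covers both NumPy float64 and the nullable Float64 dtype.
        if not pd.api.types.is_float_dtype(col.dtype):
            continue
        # Missing values are ignored for the check and kept by the cast.
        values = col.dropna().to_numpy(dtype="float64")
        is_whole = (
            np.isfinite(values).all()
            and (values == np.trunc(values)).all()
            and (np.abs(values) < 2.0**63).all()
        )
        if is_whole:
            out[name] = col.astype("Int64")
    return out


df = pd.DataFrame({"column": [0.0, 1.0, 2.0, 3.3]}).convert_dtypes()
newdf = df.iloc[:-1]

fixed = whole_floats_to_int64(newdf)
print(fixed.dtypes)
```

Applied to the original df, which still contains 3.3, the helper leaves the column's dtype unchanged, because not every value is whole.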

Installed Versions

INSTALLED VERSIONS
------------------
commit                : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python                : 3.11.7.final.0
python-bits           : 64
OS                    : Darwin
OS-release            : 23.4.0
Version               : Darwin Kernel Version 23.4.0: Fri Mar 15 00:10:42 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6000
machine               : arm64
processor             : arm
byteorder             : little
LC_ALL                : None
LANG                  : None
LOCALE                : None.UTF-8

pandas                : 2.2.2
numpy                 : 1.26.4
pytz                  : 2024.1
dateutil              : 2.9.0
setuptools            : 69.5.1
pip                   : 24.0
Cython                : 3.0.8
pytest                : 8.0.0
hypothesis            : None
sphinx                : None
blosc                 : None
feather               : None
xlsxwriter            : 3.1.9
lxml.etree            : 5.1.0
html5lib              : 1.1
pymysql               : None
psycopg2              : None
jinja2                : 3.1.3
IPython               : 8.21.0
pandas_datareader     : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : 4.12.3
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : None
gcsfs                 : None
matplotlib            : 3.8.2
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : 3.1.2
pandas_gbq            : None
pyarrow               : None
pyreadstat            : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : 1.12.0
sqlalchemy            : None
tables                : None
tabulate              : None
xarray                : None
xlrd                  : None
zstandard             : None
tzdata                : 2024.1
qtpy                  : None
pyqt5                 : None


Labels: Bug, Needs Triage (issue that has not been reviewed by a pandas team member)
