BUG: Pandas DataFrame replace() doesn't work if the dataframe has nullable boolean columns #44499

Closed
@ilCatania

Description

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd
df = pd.DataFrame({"a": ["x", "y", "z", "w"], "b": [True, False, False, True]}).astype({"b":"boolean"})
df.replace("x", "X")

Issue Description

pandas.DataFrame.replace() fails on a dataframe with several columns when one of them is a boolean column whose nullable "boolean" dtype has been set explicitly beforehand. Running the above snippet raises the following error:

File "C:\users\el_caminooooo\playground\venv/lib/python3.7/site-packages/pandas/core/arrays/boolean.py", line 265, in __init__
    "values should be boolean numpy array. Use "
TypeError: values should be boolean numpy array. Use the 'pd.array' function instead
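
For context, the message in the traceback matches the type check in the BooleanArray constructor, which only accepts a NumPy array of dtype bool as its values. The sketch below triggers that check directly. It only illustrates where the message comes from; it is not a reconstruction of the code path replace() takes internally, and the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Object-dtype values: these hold booleans but are not a boolean numpy array.
values = np.array([True, False, False, True], dtype=object)
mask = np.zeros(len(values), dtype=bool)  # no missing entries

message = None
try:
    # The constructor rejects anything that is not a bool-dtype ndarray.
    pd.arrays.BooleanArray(values, mask)
except TypeError as exc:
    message = str(exc)

print(message)
```

If replace() ends up rebuilding the column from an object-dtype array, this is the kind of failure one would expect to see, though that is an assumption rather than something the traceback above confirms.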

Is this a pandas bug? Also, please note that both of the following work fine:

import pandas as pd
df = pd.DataFrame({"a": ["x", "y", "z", "w"], "b": [True, False, False, True]}).astype({"b":"bool"})  # notice 'bool' instead of 'boolean'
df.replace("x", "X")

and:

df = pd.DataFrame({"a": ["x", "y", "z", "w"], "b": [True, False, False, True]})
df.replace("x", "X")

Expected Behavior

The replace should complete successfully, returning a dataframe in which "x" in column a becomes "X" and column b is left unchanged.
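
As a possible workaround on affected versions, the replacement can be applied column by column, skipping columns with the nullable boolean dtype. This is a minimal sketch, not an official fix: it assumes that Series.replace() on the remaining columns is unaffected by the bug, and the name result is only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": ["x", "y", "z", "w"], "b": [True, False, False, True]}
).astype({"b": "boolean"})

result = df.copy()
for col in result.columns:
    # Leave nullable boolean columns alone; they cannot contain "x" anyway.
    if isinstance(result[col].dtype, pd.BooleanDtype):
        continue
    result[col] = result[col].replace("x", "X")

print(result)
```

This gives the outcome described above: column a has "X" in place of "x", and column b keeps its boolean dtype and values.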

Installed Versions

INSTALLED VERSIONS
------------------
commit : 7c48ff4
python : 3.7.10.final.0
python-bits : 64
OS : Linux
OS-release : 3.10.0-957.27.2.el7.x86_64
Version : #1 SMP Mon Jul 29 17:46:05 UTC 2019
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.4
numpy : 1.21.0
pytz : 2021.1
dateutil : 2.8.1
pip : 21.3.1
setuptools : 47.1.0
Cython : 0.29.23
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : None
pandas_datareader : 0.10.0
bs4 : None
bottleneck : None
fsspec : 2021.08.1
fastparquet : 0.7.0
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 4.0.0
pyxlsb : None
s3fs : None
scipy : 1.6.3
sqlalchemy : None
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
numba : None

Labels

  • Bug
  • NA - MaskedArrays: Related to pd.NA and nullable extension arrays
  • Numeric Operations: Arithmetic, Comparison, and Logical operations
  • replace: replace method
