pd.DataFrame.replace doesn't work after converting to new dtypes in 1.0.0 #31517

Closed
@scottboston

Description

Code Sample, a copy-pastable example if possible

import pandas as pd
pd.show_versions()
df = pd.DataFrame({'grp':[1,2,3,4,5]})
df = df.convert_dtypes()  # converts 'grp' to a nullable extension dtype
df.replace(1,10)  # raises IndexError on pandas 1.0.0

Problem description

pd.DataFrame.replace raises an IndexError when called on a DataFrame that has been converted with pd.DataFrame.convert_dtypes.
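
To make the difference between the two frames visible, here is a minimal sketch that compares the column dtype before and after `convert_dtypes`. The variable names are illustrative only. The traceback goes through `pandas/core/arrays/integer.py`, so the converted column is assumed to be backed by the nullable integer extension array rather than a plain NumPy array.

```python
import pandas as pd

df = pd.DataFrame({'grp': [1, 2, 3, 4, 5]})
converted = df.convert_dtypes()

# Compare the column dtype before and after the conversion.
dtype_before = df['grp'].dtype
dtype_after = converted['grp'].dtype
print(dtype_before, '->', dtype_after)

# True when the column uses a pandas extension dtype
# instead of a plain NumPy dtype.
is_extension = pd.api.types.is_extension_array_dtype(dtype_after)
print(is_extension)
```

If the second line prints `True`, `replace` is dispatching to the extension-array code path shown in the stack trace, not the NumPy one used by the unconverted frame.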

Stack trace:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-fb2a3a449007> in <module>
      3 df = pd.DataFrame({'grp':[1,2,3,4,5]})
      4 df = df.convert_dtypes()
----> 5 df.replace(1,10)

~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in replace(self, to_replace, value, inplace, limit, regex, method)
   4167             limit=limit,
   4168             regex=regex,
-> 4169             method=method,
   4170         )
   4171 

~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
   6735                 elif not is_list_like(value):  # NA -> 0
   6736                     new_data = self._data.replace(
-> 6737                         to_replace=to_replace, value=value, inplace=inplace, regex=regex
   6738                     )
   6739                 else:

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/managers.py in replace(self, value, **kwargs)
    587     def replace(self, value, **kwargs):
    588         assert np.ndim(value) == 0, value
--> 589         return self.apply("replace", value=value, **kwargs)
    590 
    591     def replace_list(self, src_list, dest_list, inplace=False, regex=False):

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/managers.py in apply(self, f, filter, **kwargs)
    440                 applied = b.apply(f, **kwargs)
    441             else:
--> 442                 applied = getattr(b, f)(**kwargs)
    443             result_blocks = _extend_blocks(applied, result_blocks)
    444 

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/blocks.py in replace(self, to_replace, value, inplace, filter, regex, convert)
    769 
    770         try:
--> 771             blocks = self.putmask(mask, value, inplace=inplace)
    772             # Note: it is _not_ the case that self._can_hold_element(value)
    773             #  is always true at this point.  In particular, that can fail

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals/blocks.py in putmask(self, mask, new, align, inplace, axis, transpose)
   1671         mask = _safe_reshape(mask, new_values.shape)
   1672 
-> 1673         new_values[mask] = new
   1674         return [self.make_block(values=new_values)]
   1675 

~/anaconda3/lib/python3.6/site-packages/pandas/core/arrays/integer.py in __setitem__(self, key, value)
    415             mask = mask[0]
    416 
--> 417         self._data[key] = value
    418         self._mask[key] = mask
    419 

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Expected Output

Should work just like it does before using pd.DataFrame.convert_dtypes.

#pd.show_versions()
df = pd.DataFrame({'grp':[1,2,3,4,5]})
#df = df.convert_dtypes()
df.replace(1,10)

Output:

   grp
0   10
1    2
2    3
3    4
4    5

#### Output of ``pd.show_versions()``

<details>

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.9.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.15.0-76-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.0
numpy            : 1.16.4
pytz             : 2019.2
dateutil         : 2.8.0
pip              : 20.0.1
setuptools       : 45.1.0
Cython           : 0.29
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.4.1
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.10.1
IPython          : 7.8.0
pandas_datareader: None
bs4              : 4.8.1
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.4.1
matplotlib       : 2.2.3
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : None
pyxlsb           : None
s3fs             : None
scipy            : 1.2.1
sqlalchemy       : 1.3.9
tables           : None
tabulate         : 0.8.6
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None
numba            : None

</details>
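
The expected behavior described above can also be written as a small executable check. This is only a sketch: the helper name `replace_with_and_without_convert` is hypothetical and not part of pandas. It compares the result of `replace` on the converted frame against the result on the original frame. On the affected version (1.0.0) the converted call should raise the `IndexError` from the stack trace instead of returning.

```python
import pandas as pd


def replace_with_and_without_convert(values, to_replace, value):
    """Hypothetical helper: run replace on the original frame and on the
    frame returned by convert_dtypes, and return both results as lists."""
    df = pd.DataFrame({'grp': values})
    # Baseline: replace on the NumPy-backed column.
    expected = df.replace(to_replace, value)['grp'].tolist()
    # Same call after conversion to the nullable dtype.
    result = df.convert_dtypes().replace(to_replace, value)['grp'].tolist()
    return result, expected


result, expected = replace_with_and_without_convert([1, 2, 3, 4, 5], 1, 10)
print(result == expected)
```

Comparing plain Python lists keeps the check independent of the dtype difference between the two frames, so it only tests whether the values were replaced.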

Labels: Bug; NA - MaskedArrays (Related to pd.NA and nullable extension arrays)