Categorical.replace() unexpectedly returns non-categorical #18250

Closed
@toobaz

Description

Code Sample, a copy-pastable example if possible

In [2]: pd.Series([1, 2, 3, 2], dtype='category').replace(2, 1) # good
Out[2]: 
0    1
1    1
2    3
3    1
dtype: category
Categories (3, int64): [1, 2, 3]

In [3]: pd.Series([1, 2, 3, 2], dtype='category').replace(2, 15) # bad
Out[3]: 
0     1
1    15
2     3
3    15
dtype: int64

Problem description

I think replace() called with a replacement value that is not an existing category should either raise an error, or just replace the element in the list of categories.

The current behavior is even more annoying if you do

In [4]: pd.Series([1, 2, np.nan, 2], dtype='category').replace(np.nan, 15)
Out[4]: 
0     1.0
1     2.0
2    15.0
3     2.0
dtype: float64

because precisely the operation of removing NaNs from integers makes them floats! Notice, for comparison, that pd.Series([1, 2, np.nan, 2], dtype='category').fillna(15) raises a ValueError.
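For anyone hitting this in the meantime, a minimal sketch of a workaround that stays categorical: register the fill value as a category first, then call fillna(). This only uses the existing `.cat.add_categories` accessor; the variable names are illustrative. Note that the categories are still floats here, because the NaN already forced the underlying values to float when the Series was built.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 2], dtype="category")

# Add 15 to the set of categories first; fillna() on a categorical
# only accepts values that are already categories.
filled = s.cat.add_categories([15]).fillna(15)

print(filled.dtype)     # still categorical
print(filled.tolist())
```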

This is related to (the discussion about) #18185.

Expected Output

In [3]: pd.Series([1, 2, 3, 2], dtype='category').replace(2, 15)
Out[3]: 
0     1
1    15
2     3
3    15
dtype: category
Categories (3, int64): [1, 3, 15]
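For reference, a sketch of how the expected output above can be produced with the existing categorical accessor, assuming that "replace the element in the list of categories" is the desired semantics. This is a workaround, not what replace() itself does: `rename_categories` swaps the category label in place, and `reorder_categories` is only needed to get the sorted category order shown above.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 2], dtype="category")

# Rename category 2 to 15; the result stays categorical,
# with categories in the order [1, 15, 3].
renamed = s.cat.rename_categories({2: 15})

# Optional: put the categories in the sorted order from the expected output.
result = renamed.cat.reorder_categories([1, 3, 15])

print(result)
```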

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 9e3ad63
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.22.0.dev0+114.g9e3ad63cd
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.7.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
