BUG: Calling Series.astype('category') on a categorical series loaded using pd.read_pickle errors on pandas-0.25.0rc0  #27295

Closed
@rwedge

Code Sample, a copy-pastable example if possible

import os

import pandas as pd

s = pd.Series(["a", "b", "c", "a"], dtype="category")
s.astype('category')  # works: this Series was built in the current session

FILEPATH = 'example.pickle'
s.to_pickle(FILEPATH)

s = pd.read_pickle(FILEPATH)
os.remove(FILEPATH)
s.astype('category')  # raises AttributeError on 0.25.0rc0 (line 13 in the traceback)

Output

Traceback (most recent call last):
  File "mre.py", line 13, in <module>
    s.astype('category')
  File "/Users/roy/pandas/pandas/core/generic.py", line 5935, in astype
    dtype=dtype, copy=copy, errors=errors, **kwargs
  File "/Users/roy/pandas/pandas/core/internals/managers.py", line 581, in astype
    return self.apply("astype", dtype=dtype, **kwargs)
  File "/Users/roy/pandas/pandas/core/internals/managers.py", line 438, in apply
    applied = getattr(b, f)(**kwargs)
  File "/Users/roy/pandas/pandas/core/internals/blocks.py", line 555, in astype
    return self._astype(dtype, copy=copy, errors=errors, values=values, **kwargs)
  File "/Users/roy/pandas/pandas/core/internals/blocks.py", line 606, in _astype
    return self.make_block(self.values.astype(dtype, copy=copy))
  File "/Users/roy/pandas/pandas/core/arrays/categorical.py", line 524, in astype
    self = self.copy() if copy else self
  File "/Users/roy/pandas/pandas/core/arrays/categorical.py", line 503, in copy
    values=self._codes.copy(), dtype=self.dtype, fastpath=True
  File "/Users/roy/pandas/pandas/core/arrays/categorical.py", line 353, in __init__
    self._dtype = self._dtype.update_dtype(dtype)
  File "/Users/roy/pandas/pandas/core/dtypes/dtypes.py", line 556, in update_dtype
    new_ordered_from_sentinel = dtype._ordered_from_sentinel
AttributeError: 'CategoricalDtype' object has no attribute '_ordered_from_sentinel'
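
The last frame shows that the unpickled CategoricalDtype lacks a private attribute that update_dtype expects. One plausible way this can happen, which the traceback does not confirm, is a class whose `__setstate__` restores only part of the state that `__init__` sets up. The sketch below uses a hypothetical `Dtype` class, not pandas' actual code, to show the effect with the standard library alone.

```python
import pickle


class Dtype:
    """Hypothetical stand-in for a dtype class; not pandas' actual code."""

    def __init__(self, categories=None, ordered=False):
        self._categories = categories
        self._ordered = ordered
        # Imagine this attribute was added in a later version of the class.
        self._ordered_from_sentinel = False

    def __getstate__(self):
        # Only the older attributes are serialized.
        return {"categories": self._categories, "ordered": self._ordered}

    def __setstate__(self, state):
        # __init__ is not called during unpickling, so any attribute that is
        # not restored here is absent from the restored instance.
        self._categories = state.get("categories")
        self._ordered = state.get("ordered", False)


fresh = Dtype(["a", "b", "c"])
restored = pickle.loads(pickle.dumps(fresh))

fresh_has_attr = hasattr(fresh, "_ordered_from_sentinel")
restored_has_attr = hasattr(restored, "_ordered_from_sentinel")
print(fresh_has_attr, restored_has_attr)  # True False
```

If this is what happens here, it would also explain why the first astype call in the code sample succeeds: that Series never went through unpickling.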

Problem description

Calling Series.astype('category') on a categorical Series loaded with pd.read_pickle raises an AttributeError on pandas 0.25.0rc0. The same call on the Series before pickling succeeds, and the whole example ran without error on pandas 0.24.2.
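
The report can be turned into a small regression check. This is a minimal sketch that assumes pandas is installed. The helper name `astype_category_after_pickle` is made up for illustration, and a temporary directory replaces the fixed file name so that nothing is left behind if the call fails.

```python
import os
import tempfile

import pandas as pd


def astype_category_after_pickle(values):
    """Round-trip a categorical Series through a pickle file, then recast it.

    Hypothetical helper for illustration; not part of pandas.
    """
    original = pd.Series(values, dtype="category")
    # The temporary directory is removed even if an exception is raised.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.pickle")
        original.to_pickle(path)
        restored = pd.read_pickle(path)
    # This is the call that raised AttributeError in the report.
    return original, restored.astype("category")


original, recast = astype_category_after_pickle(["a", "b", "c", "a"])
print(recast)
```

On a pandas version without the bug, the call returns a categorical Series with the same values and categories as the original.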

Expected Output

0    a
1    b
2    c
3    a
dtype: category

Output of pd.show_versions()

INSTALLED VERSIONS

commit : c64c9cb
python : 3.7.3.final.0
python-bits : 64
OS : Darwin
OS-release : 18.6.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.0rc0+23.gc64c9cb44
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.12
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None


Labels: Blocker (blocking issue or pull request for an upcoming release), Categorical (Categorical data type)
