ERR: Segfault with df.astype(category) on empty dataframe #18004

Closed
@jcrist

Pandas segfaults when calling DataFrame.astype('category') on an empty DataFrame. This happens in 0.21.0rc1 and 0.20.3, and probably in earlier versions as well.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'x': ['a', 'b', 'c'], 'y': ['a', 'b', 'c'], 'z': ['a', 'b', 'c']})

In [3]: empty = df.iloc[:0].copy()  # copy is necessary, no segfault without it

In [4]: empty.astype('category')
Segmentation fault: 11

For non-empty frames, an error is raised saying this operation isn't supported yet. Note that the copy is needed to trigger the segfault; without it, the same error is raised instead of a crash.
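Because the reproduction above kills the interpreter, it can be convenient to run it in a child process and inspect the exit status instead. The following is a minimal sketch, not part of the original report: the helper names `classify` and `run_isolated` are hypothetical, and the check for a negative return code equal to `-SIGSEGV` assumes a POSIX system, where `subprocess` reports death by signal that way.

```python
import signal
import subprocess
import sys

# Reproduction steps from the report, kept as a string so they can be
# executed in a separate interpreter.
REPRO = """
import pandas as pd
df = pd.DataFrame({'x': ['a', 'b', 'c'], 'y': ['a', 'b', 'c'], 'z': ['a', 'b', 'c']})
empty = df.iloc[:0].copy()  # the copy is needed to trigger the crash
empty.astype('category')
"""


def classify(returncode):
    """Map a child exit status to a short label (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    # On POSIX, a child killed by signal N has returncode -N.
    if returncode == -signal.SIGSEGV:
        return "segfault"
    return "error"


def run_isolated(code):
    """Run `code` in a fresh interpreter and classify how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return classify(proc.returncode)


if __name__ == "__main__":
    print(run_isolated(REPRO))
```

On an affected version, this would be expected to report a segfault for the empty frame, while an ordinary Python exception in the child (including a missing pandas install) is reported as a plain error. The result depends on the installed pandas version.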

Output of pd.show_versions():
In [5]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Darwin
OS-release: 16.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.1.0
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.25.2
numpy: 1.13.3
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.5.3
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: 0.1.1
pandas_gbq: None
pandas_datareader: None

Labels: Categorical (Categorical Data Type), Error Reporting (Incorrect or improved errors from pandas)
