IntervalIndex does not accept CategoricalIndex (of interval dtype) #21243

Closed
@toobaz

Code Sample, a copy-pastable example if possible

In [2]: pd.qcut(range(100), 10).value_counts().index.categories.dtype
Out[2]: interval[float64]

In [3]: pd.IntervalIndex(pd.qcut(range(100), 10).value_counts().index)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-32e3bb1a4f49> in <module>()
----> 1 pd.IntervalIndex(pd.qcut(range(100), 10).value_counts().index)

~/nobackup/repo/pandas/pandas/core/indexes/interval.py in __new__(cls, data, closed, dtype, copy, name, fastpath, verify_integrity)
    238 
    239             data = maybe_convert_platform_interval(data)
--> 240             left, right, infer_closed = intervals_to_interval_bounds(data)
    241 
    242             if (com._all_not_none(closed, infer_closed) and

TypeError: Argument 'intervals' has incorrect type (expected numpy.ndarray, got CategoricalIndex)

Problem description

From a practical point of view, being able to do the above would simply be handy. From a conceptual point of view, whether a Series or Index has categorical dtype should be irrelevant to any operation that does not directly concern categories, so the above should work exactly as it would if the input were not a CategoricalIndex.

I think the following also reflects the same problem:

In [5]: pd.IntervalIndex(pd.qcut(range(100), 10))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-3744b32bfad8> in <module>()
----> 1 pd.IntervalIndex(pd.qcut(range(100), 10))

~/nobackup/repo/pandas/pandas/core/indexes/interval.py in __new__(cls, data, closed, dtype, copy, name, fastpath, verify_integrity)
    238 
    239             data = maybe_convert_platform_interval(data)
--> 240             left, right, infer_closed = intervals_to_interval_bounds(data)
    241 
    242             if (com._all_not_none(closed, infer_closed) and

TypeError: Argument 'intervals' has incorrect type (expected numpy.ndarray, got Categorical)

... and hence passing index.values also raises an error.

Expected Output

An IntervalIndex with the same content as the CategoricalIndex.
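
A minimal sketch of the expected result, written as a possible workaround rather than a confirmed fix: it assumes that converting the CategoricalIndex to an object ndarray of Interval objects first (via `np.asarray`) gives the constructor input it accepts. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same construction as in the report: a CategoricalIndex whose
# categories have interval dtype.
cat_index = pd.qcut(range(100), 10).value_counts().index

# Assumed workaround: materialize the Interval objects into a plain
# object ndarray before handing them to the IntervalIndex constructor.
intervals = np.asarray(cat_index, dtype=object)
result = pd.IntervalIndex(intervals)

# The result should hold the same intervals, in the same order,
# as the original CategoricalIndex.
print(type(result).__name__)
print(list(result) == list(cat_index))
```

If the constructor handled categorical input directly, `pd.IntervalIndex(cat_index)` would be expected to return the same object as `result` here.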

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+25.gcd0447102
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.25.2
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1153+gff6786446
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

Labels

Categorical (Categorical Data Type)
Indexing (Related to indexing on series/frames, not to indexes themselves)
Interval (Interval data type)
Regression (Functionality that used to work in a prior pandas version)
