Categorical handles imaginary values incorrectly #16399

Closed
@TomAugspurger

Description

In [3]: import pandas as pd

In [4]: pd.Categorical([1, 2 + 2j])
Out[4]:
[1.0, 2.0]
Categories (2, float64): [1.0, 2.0]

In [6]: pd.Categorical([1, 2, 2 + 2j])
Out[6]:
[1.0, 2.0, 2.0]
Categories (2, float64): [1.0, 2.0]  # Missing a `2 + 2j` category

I think it comes down to `factorize`, which drops the imaginary part as well:

In [9]: pd.factorize([1, 2, 2 + 1j])
Out[9]: (array([0, 1, 1]), array([ 1.,  2.]))
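For reference, here is a rough pure-Python sketch of what I'd expect factorization to return for these values, next to a lossy variant that reproduces the output above. `factorize_sketch` is a hypothetical helper written for this illustration, not pandas code, and the assumption that the imaginary part gets discarded before hashing is a guess at the cause rather than something confirmed here.

```python
def factorize_sketch(values):
    """Hypothetical helper: integer codes plus uniques in order of first appearance."""
    seen = {}      # value -> code
    uniques = []   # distinct values, in order of first appearance
    codes = []     # one code per input value
    for v in values:
        if v not in seen:
            seen[v] = len(uniques)
            uniques.append(v)
        codes.append(seen[v])
    return codes, uniques


values = [1, 2, 2 + 1j]

# Expected: 2 and 2 + 1j are different values, so they get different codes.
codes, uniques = factorize_sketch(values)
print(codes, uniques)              # [0, 1, 2] [1, 2, (2+1j)]

# Assumed failure mode: keep only the real part before factorizing.
lossy_codes, lossy_uniques = factorize_sketch([complex(v).real for v in values])
print(lossy_codes, lossy_uniques)  # [0, 1, 1] [1.0, 2.0]
```

The lossy variant matches `Out[9]`, which is at least consistent with the values being coerced to float somewhere along the way.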

Clearly not much demand for this, but figured I'd park an issue anyway. Discovered this in #16015, which will not be fixing it.

Labels

Categorical (Categorical Data Type), Complex (Complex Numbers), Needs Tests (Unit test(s) needed to prevent regressions)
