API: provide a to_enum method for Categorical dtypes? #16749

Closed
@chris-b1

xref #16015

This might be a nice thing to provide. The specific use case I had in mind is interacting with numba: it understands enums in nopython mode, so this would be a way to work with Categoricals there.

A partially working implementation is below. It builds the class source from a template and runs it through exec, similar to what namedtuple does.

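def categorical_to_enum(categorical, name):
    # Category labels become member names; their positions become the codes.
    cats = categorical.categories

    # Source template for the generated class. NA = -1 matches the code
    # pandas uses for missing values in a Categorical.
    template = """from enum import IntEnum
class {name}(IntEnum):
    NA = -1
{fields}
"""
    # Every category must be a string that is a valid Python identifier.
    assert all(isinstance(x, str) and x.isidentifier()
               for x in cats)

    fields = '\n'.join('    {cat} = {code}'.format(cat=cat, code=i)
                       for i, cat in enumerate(cats))
    # Execute the generated source in a scratch namespace and pull the class out.
    ns = {}
    exec(template.format(name=name, fields=fields), ns)
    return ns[name]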

# example usage
In [81]: c = pd.Categorical(['a', 'b', 'a', 'c'])

In [82]: MyEnum = categorical_to_enum(c, 'MyEnum')

In [83]: MyEnum.c
Out[83]: <MyEnum.c: 2>
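
For comparison, here is a rough sketch of the same idea without exec, using the functional API of the standard library's IntEnum. The helper name categories_to_enum and the extra checks (rejecting keywords and a category literally named NA) are my own assumptions, not part of the proposal above. It takes a plain sequence of labels so that it runs without pandas.

```python
from enum import IntEnum
import keyword


def categories_to_enum(categories, name):
    """Hypothetical helper: build an IntEnum from category labels."""
    # -1 is the code pandas assigns to missing values in a Categorical.
    members = {"NA": -1}
    for code, cat in enumerate(categories):
        # Member names must be usable as attributes, so require identifiers
        # and reject reserved words such as "class".
        if not (isinstance(cat, str) and cat.isidentifier()) or keyword.iskeyword(cat):
            raise ValueError(f"category {cat!r} is not a usable member name")
        # Guard against duplicates and against a category named "NA".
        if cat in members:
            raise ValueError(f"category {cat!r} collides with an existing member")
        members[cat] = code
    # The functional API accepts a mapping of member name to value.
    return IntEnum(name, members)


MyEnum = categories_to_enum(["a", "b", "c"], "MyEnum")

# Integer codes (as found in Categorical.codes) map back to members by value.
codes = [0, 1, 0, 2, -1]
decoded = [MyEnum(code) for code in codes]
```

With a real Categorical this would presumably be called as categories_to_enum(c.categories, 'MyEnum'), and the codes would come from c.codes. Avoiding exec keeps category labels out of generated source, at the cost of diverging from the template approach above.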

Labels: Categorical, Enhancement, Needs Discussion
