PERF: change impl for Categorical to use smaller dtype arrays #8453

Closed
@jreback

So it seems that by using a full Int64 array for the codes, plus the categories, we are actually using MORE memory to store a Categorical than a plain object array would: the codes are the same size as the pointers in an object array, and the categories have to be stored on top of that.

So we need to change the codes storage to use a smaller integer dtype. Maybe switch this to a plain ndarray and use dtype=uint8. That would provide a lot of benefit.
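
A rough sketch of the saving, using plain NumPy rather than the actual Categorical internals. The helper `smallest_code_dtype` is hypothetical, and it assumes signed dtypes so that -1 stays available as a missing-value code; with no sentinel, uint8 as suggested above would behave the same way for up to 256 categories.

```python
import numpy as np


def smallest_code_dtype(n_categories):
    """Hypothetical helper: narrowest signed integer dtype that can hold the codes.

    Signed types are an assumption here, so that -1 can mark a missing value.
    """
    for dtype in (np.int8, np.int16, np.int32, np.int64):
        # Codes run from 0 to n_categories - 1.
        if n_categories - 1 <= np.iinfo(dtype).max:
            return np.dtype(dtype)
    raise ValueError("too many categories for an integer dtype")


categories = np.array(["low", "medium", "high"], dtype=object)

# One million codes stored as int64: 8 bytes per element, the same as a
# pointer in an object array on a 64-bit build.
rng = np.random.default_rng(0)
codes64 = rng.integers(0, len(categories), size=1_000_000, dtype=np.int64)

# The same codes stored in the narrowest dtype that fits: 1 byte per element.
codes_small = codes64.astype(smallest_code_dtype(len(categories)))

print(codes64.nbytes, codes_small.nbytes)

# Both code arrays decode to the same values.
assert (categories[codes_small] == categories[codes64]).all()
```

With only a handful of categories the codes shrink by a factor of 8, and the categories array itself is negligible next to a million codes.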

Labels: Categorical (Categorical Data Type), Performance (Memory or execution speed performance)
