ENH: should pandas.core.arrays.Categorical have a dropna=False option? #35162

Closed
@arw2019

I ran into this while working on #35078. Here's a simple reproducer:

In [1]: from pandas.core.arrays import Categorical
In [2]: Categorical([1, 2, 3, 4, 4, 4, None, None])
Out[2]: 
[1, 2, 3, 4, 4, 4, NaN, NaN]
Categories (4, int64): [1, 2, 3, 4]

Do people think it's worth having a dropna argument for Categorical, so that one could do:

In [2]: Categorical([1, 2, 3, 4, 4, 4, None, None], dropna=False)
Out[2]: 
[1, 2, 3, 4, 4, 4, NaN, NaN]
Categories (5, int64): [1, 2, 3, 4, NaN]

If dropna=True is the default (matching the current behavior, where NaN is excluded from the categories), there presumably shouldn't be any backward-compatibility issues.

I can see arguments either way! If we do want to add this, I'm happy to work on it (and the solution in #35078 will be a lot cleaner).
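For reference, here is a minimal sketch of how the current behavior shows up in the existing API, using only attributes that Categorical already has. It does not implement the proposed dropna argument. It only illustrates that missing values are kept in the data but excluded from the categories, and that they can still be counted today. The variable names are just for illustration.

```python
import pandas as pd

cat = pd.Categorical([1, 2, 3, 4, 4, 4, None, None])

# Missing values are not given a category; they are stored with code -1.
codes = cat.codes.tolist()
categories = cat.categories.tolist()

# The missing entries are still present in the array itself.
n_missing = int(cat.isna().sum())

# value_counts already accepts dropna=False, which adds a NaN row
# to the result without adding NaN to the categories.
counts = cat.value_counts(dropna=False)

print(codes)
print(categories)
print(n_missing)
print(counts)
```

So the information about missing values is not lost. The question in this issue is only whether NaN should optionally appear in the categories themselves.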
