Closed
Description
I ran into this while working on #35078. Here's a simple reproducer:
In [1]: from pandas.core.arrays import Categorical
In [2]: Categorical([1, 2, 3,4,4,4, None, None])
Out[2]:
[1, 2, 3, 4, 4, 4, NaN, NaN]
Categories (4, int64): [1, 2, 3, 4]
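For reference, here is a minimal sketch of how the current behaviour can be inspected, using only the public `pd.Categorical` API. The variable name `cat` is just for illustration. It shows that the missing values are not stored as a category but are encoded with the sentinel code `-1`:

```python
import pandas as pd

# Same input as the reproducer above.
cat = pd.Categorical([1, 2, 3, 4, 4, 4, None, None])

# Missing values are excluded from the categories...
print(list(cat.categories))

# ...and are represented in the integer codes by the sentinel -1.
print(cat.codes.tolist())

# They are still reported as missing by isna().
print(int(cat.isna().sum()))
```

So today the only trace of the `None` entries is the `-1` code, which is what makes it awkward to treat missing values as a group of their own.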
Do people think it's worth having a dropna argument for Categorical, so that one could do:
In [3]: Categorical([1, 2, 3, 4, 4, 4, None, None], dropna=False)
Out[3]:
[1, 2, 3, 4, 4, 4, NaN, NaN]
Categories (5, int64): [1, 2, 3, 4, NaN]
If we set dropna=True by default, presumably there shouldn't be any backward compatibility issues.
I can see arguments either way! If we do want to add this, I'm happy to work on it (and the solution in #35078 will be a lot cleaner).