DISCUSS: boolean dtype with missing value support #28778

Part of the discussion on missing value handling in https://github.com/pandas-dev/pandas/issues/28095, detailed proposal at https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB.

*If* we go for a new NA value, we also need to decide how this value behaves in comparison operations. Consequently, we also need to decide how boolean values with missing data behave in logical operations and in indexing operations.  
So let's use this issue for that part of the discussion.

Some aspects of this:

- Behaviour in **comparison operations**: currently `np.nan` compares unequal (`value == np.nan -> False`, `value > np.nan -> False`), but we could also propagate missing values (`value == NA -> NA`, ...).
- Behaviour in **logical operations**: currently we always return False for `|` or `&` with missing data. But we could also use a "three-valued logic" like [Julia](https://docs.julialang.org/en/v1/manual/missing/index.html#Logical-operators-1) and SQL (this has, e.g., `NA | True = True` or `NA & True = NA`).
- Behaviour in **indexing**: currently you cannot do boolean indexing with a boolean series that contains missing values (which is object dtype right now). Do we want to change this? For example, we could interpret a missing value as False (i.e. not select that element).
  (TODO: should check how other languages do this)
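
To make the comparison and logical options above concrete, here is a minimal pure-Python sketch. It is not pandas code: `NA` is a hypothetical stand-in (just `None`) for the proposed scalar, and `na_eq`, `kleene_or` and `kleene_and` are illustrative helper names, not an existing API.

```python
import math

# Current behaviour of NaN in comparisons: the result is always False.
assert (1.0 == math.nan) is False
assert (1.0 > math.nan) is False

NA = None  # hypothetical stand-in for the proposed NA scalar


def na_eq(a, b):
    """Equality that propagates NA instead of returning False."""
    if a is NA or b is NA:
        return NA
    return a == b


def kleene_or(a, b):
    """Three-valued OR: True wins, otherwise NA is contagious."""
    if a is True or b is True:
        return True
    if a is NA or b is NA:
        return NA
    return False


def kleene_and(a, b):
    """Three-valued AND: False wins, otherwise NA is contagious."""
    if a is False or b is False:
        return False
    if a is NA or b is NA:
        return NA
    return True
```

With these definitions, `kleene_or(NA, True)` gives `True` and `kleene_and(NA, True)` gives `NA`, matching the Julia/SQL examples quoted above, while `na_eq(1, NA)` gives `NA` rather than `False`.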

Julia has a nice documentation page explaining how they support [missing values](https://docs.julialang.org/en/v1/manual/missing/index.html); the above ideas largely match that.

Besides those behavioural API discussions, we also need to decide how to approach this technically (a boolean ExtensionArray backed by a boolean numpy array plus a mask for missing values?). Shall we discuss that here as well, or keep that separate?
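
As a rough illustration of the "values + mask" idea, the sketch below uses plain Python lists in place of NumPy arrays so that it stays self-contained. `MaskedBoolArray` and its methods are hypothetical names for this example only; a real ExtensionArray would look different and would need to implement the pandas extension interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MaskedBoolArray:
    """Hypothetical container: boolean values plus a parallel mask (True = missing)."""

    values: List[bool]
    mask: List[bool]

    @classmethod
    def from_optional(cls, items: List[Optional[bool]]) -> "MaskedBoolArray":
        # Missing entries get a placeholder value of False; the mask marks them.
        values = [False if x is None else bool(x) for x in items]
        mask = [x is None for x in items]
        return cls(values, mask)

    def to_optional(self) -> List[Optional[bool]]:
        return [None if m else v for v, m in zip(self.values, self.mask)]

    def __or__(self, other: "MaskedBoolArray") -> "MaskedBoolArray":
        # Three-valued OR: a known True decides the result, else missing propagates.
        values, mask = [], []
        for v1, m1, v2, m2 in zip(self.values, self.mask, other.values, other.mask):
            known_true = (v1 and not m1) or (v2 and not m2)
            values.append(known_true)
            mask.append((m1 or m2) and not known_true)
        return MaskedBoolArray(values, mask)

    def as_indexer(self) -> List[bool]:
        # One possible indexing rule from the list above: treat missing as False.
        return [v and not m for v, m in zip(self.values, self.mask)]


a = MaskedBoolArray.from_optional([True, False, None, None])
b = MaskedBoolArray.from_optional([None, None, True, False])
combined = (a | b).to_optional()

data = [10, 20, 30, 40]
selected = [x for x, keep in zip(data, a.as_indexer()) if keep]
```

Here `combined` is `[True, None, True, None]` and `selected` is `[10]`. Keeping the mask separate means the logical operations can be computed from two boolean arrays without falling back to object dtype.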

cc @pandas-dev/pandas-core 
