API: common dtype for bool + numeric: upcast to object or coerce to numeric? 

We currently have an inconsistency in how we determine the common dtype for bool + numeric. 

NumPy coerces booleans to numeric values when combining them with a numeric dtype:

```python
>>> np.concatenate([np.array([True]), np.array([1])])
array([1, 1])
```
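The same rule can be checked at the dtype level, without building any arrays. This is only a small sketch using `np.result_type`; the variable names are illustrative.

```python
import numpy as np

# Ask NumPy which dtype it would pick when combining bool with a numeric dtype.
int_result = np.result_type(np.bool_, np.int64)
float_result = np.result_type(np.bool_, np.float64)

print(int_result)    # int64
print(float_result)  # float64
```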

In pandas, Series does the same:

```python
>>> pd.concat([pd.Series([True], dtype=bool), pd.Series([1.0], dtype=float)])
0    1.0
0    1.0
dtype: float64
```

The exception is when the Series are empty, in which case we ensure the result is object dtype:

```python
>>> pd.concat([pd.Series([], dtype=bool), pd.Series([], dtype=float)])
Series([], dtype: object)
```

And for DataFrame we return object dtype in all cases:

```python
>>> pd.concat([pd.DataFrame({'a': np.array([], dtype=bool)}), pd.DataFrame({'a': np.array([], dtype=float)})]).dtypes
a    object
dtype: object
>>> pd.concat([pd.DataFrame({'a': np.array([True], dtype=bool)}), pd.DataFrame({'a': np.array([1.0], dtype=float)})]).dtypes
a    object
dtype: object
```

For the nullable dtypes, the current implementation is also inconsistent:

```python
>>> pd.concat([pd.Series([True], dtype="boolean"), pd.Series([1], dtype="Int64")])
0    1
0    1
dtype: Int64

>>> pd.concat([pd.Series([True], dtype="boolean"), pd.Series([1], dtype="Float64")])
0    True
0     1.0
dtype: object
```

So here we preserve the numeric dtype for Integer, but convert to object for Float. The reason is that `IntegerDtype._get_common_dtype` handles the boolean dtype case and then uses the NumPy rules to determine the result dtype, while `FloatingDtype` doesn't yet handle non-float dtypes and thus returns object dtype for anything else (including the float NumPy dtype, which is clearly a bug / missing feature).

---

Basically, we need to decide on the desired behaviour for the bool + numeric dtype combination: coerce to numeric or upcast to object? We can then fix the inconsistencies according to the decided rule.
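To make the two options concrete, here is a minimal sketch of what each rule would mean for plain NumPy dtypes. The helper `common_dtype` and its `bool_policy` parameter are hypothetical names made up for this illustration; they are not pandas API and do not reflect how `_get_common_dtype` is implemented.

```python
import numpy as np


def common_dtype(dtypes, bool_policy="coerce"):
    """Hypothetical helper (not pandas API): common dtype under either rule.

    bool_policy="coerce": follow NumPy and treat bool as numeric.
    bool_policy="object": upcast to object whenever bool meets a numeric dtype.
    """
    if bool_policy not in ("coerce", "object"):
        raise ValueError(f"unknown bool_policy: {bool_policy!r}")

    dtypes = [np.dtype(dt) for dt in dtypes]
    has_bool = any(dt.kind == "b" for dt in dtypes)
    # Signed/unsigned integers, floats and complex count as numeric here.
    has_numeric = any(dt.kind in "iufc" for dt in dtypes)

    if bool_policy == "object" and has_bool and has_numeric:
        return np.dtype(object)
    # Otherwise defer to NumPy's promotion rules.
    return np.result_type(*dtypes)


coerced = common_dtype([bool, "float64"])                        # float64
upcast = common_dtype([bool, "float64"], bool_policy="object")   # object
only_bool = common_dtype([bool, bool], bool_policy="object")     # bool
```

Whichever rule is chosen, the point is that it would apply uniformly, regardless of whether the inputs are empty, Series or DataFrame, or numpy-backed or nullable.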



API: common dtype for bool + numeric: upcast to object or coerce to numeric? #39817
