API: dtype_backend and constructors long-term #51846

On today's dev call we discussed two things: the dtype_backend global option, which we decided to revert for 2.0, and the use_nullable_dtypes keyword in IO methods, which we decided to change to dtype_backend. Where use_nullable_dtypes already exists in 1.5, it will be deprecated in favor of dtype_backend.

Some of the reasoning centered on the fact that constructors do not currently recognize the dtype_backend option, and there was an offhand reference to adding it as a keyword to the constructors.  I think we should avoid adding keywords/options when there are viable alternatives.  Going one step further: long term I think *we need neither a dtype_backend option nor a dtype_backend keyword anywhere.*

For IO functions, the "engine" keyword (where applicable) should determine what kind of dtypes you get back.  The most relevant case is `engine="pyarrow"`.  If you want something else, you can use `obj.convert_dtypes(...)`.
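A minimal sketch of the "read first, convert afterwards" pattern, assuming pandas 1.2 or later. The inline CSV and the column names `a` and `b` are made up for illustration. The default reader is used so that the example runs without pyarrow installed.

```python
import io

import pandas as pd

# Hypothetical two-column CSV; the second row has a missing value in "b".
csv = io.StringIO("a,b\n1,1.5\n2,\n")

# Default reader: NumPy-backed dtypes (int64 and float64, with NaN for the gap).
df = pd.read_csv(csv)

# Opt in to nullable extension dtypes after the fact.
converted = df.convert_dtypes()

print(df.dtypes)
print(converted.dtypes)

# With pandas >= 2.0 and pyarrow installed, Arrow-backed dtypes can be
# requested the same way: df.convert_dtypes(dtype_backend="pyarrow")
```

Here the reader itself takes no dtype-related keyword; the choice of backend is a separate, explicit step on the result.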

For constructors, the "dtype" keyword should be sufficient in _most_ cases.  The two cases where it is not are:

a) dtype=None
The natural thing to do is infer based on the data.  If the data is already a pandas object we retain the dtype.  If it is a numpy object we use a numpy dtype.  If it is a pyarrow object we use pd.ArrowDtype.  That leaves cases where it is e.g. a list.  For that we could plausibly use a global option, but it'd be simpler to just have a sensible default and tell users to use convert_dtypes (or pass a keyword) if desired.

b) dtype=int|"int"|"int64"|... where we could plausibly default to either np.int64 or pa.int64.  As above, could have a global option but better to just have a sensible default and tell users to be more specific or use convert_dtypes.
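For reference, here is a small sketch of how the constructor treats these inputs, assuming a recent pandas (1.x or 2.x) and NumPy. The variable names are illustrative only, and the pyarrow case is left as a comment so the snippet runs without pyarrow.

```python
import numpy as np
import pandas as pd

# NumPy input: the NumPy dtype is kept.
from_numpy = pd.Series(np.array([1, 2, 3], dtype="int32"))

# pandas input: the existing (here nullable) dtype is retained.
nullable = pd.Series([1, 2, None], dtype="Int64")
from_pandas = pd.Series(nullable)

# Plain list with dtype=None: inferred, currently as NumPy int64.
from_list = pd.Series([1, 2, 3])

# dtype="int64" is case (b): today it resolves to the NumPy dtype.
explicit = pd.Series([1, 2, 3], dtype="int64")

# Being more specific, or converting afterwards, gives the nullable dtype.
opted_in = from_list.convert_dtypes()

# With pyarrow installed (pandas >= 1.5), an Arrow-backed dtype can be
# requested explicitly: pd.Series([1, 2, 3], dtype="int64[pyarrow]")

for name, s in [("from_numpy", from_numpy), ("from_pandas", from_pandas),
                ("from_list", from_list), ("explicit", explicit),
                ("opted_in", opted_in)]:
    print(name, s.dtype)
```

In each case the dtype is decided by the input or by an explicit request, without consulting a global option.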

