API: infer_dtype with skipna=True only skip valid-for-dtype NAs

I'm working on making `lib.infer_dtype` copy-free, and I'm finding that things would be easier and more consistent if we tweaked the meaning of the `skipna` keyword.

In particular, instead of doing `values = values[~isnaobj(values)]`, followed by e.g. `is_string_array(values)`, we could do `is_string_array(values, skipna=skipna)`. This would change the results in cases where the array contains NA values that `is_string_array` does not consider valid nulls. For example, in the status quo:

```
import pandas as pd
import numpy as np
from pandas._libs import lib

arr = np.array(["foo", pd.NaT, "bar"], dtype=object)

In [2]: lib.infer_dtype(arr, skipna=True)
Out[2]: 'string'

In [3]: lib.is_string_array(arr, skipna=True)
Out[3]: False
```

So the suggestion here is to change [2] to give 'mixed'. I'm finding that to make this work without breaking the world we also need to change `StringValidator.is_valid_null` to accept `np.nan` and `None` (it currently accepts only `pd.NA`).
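
To make the difference concrete, here is a rough pure-Python sketch of the two semantics. It is not the Cython implementation: `NaT` and `NA` below are hypothetical placeholder objects standing in for `pd.NaT` and `pd.NA`, and the helper names (`is_any_na`, `is_valid_string_null`, `infer_status_quo`, `infer_proposed`) are made up for illustration. It also assumes the string validator has already been widened to accept `np.nan` and `None`, as suggested above.

```python
import math


class _Sentinel:
    """Hypothetical placeholder for a pandas missing-value singleton."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


# Stand-ins only, so the sketch runs without pandas installed.
NaT = _Sentinel("NaT")
NA = _Sentinel("NA")


def _is_float_nan(value):
    return isinstance(value, float) and math.isnan(value)


def is_any_na(value):
    """Status quo: every missing-value marker counts as NA."""
    return value is None or value is NaT or value is NA or _is_float_nan(value)


def is_valid_string_null(value):
    """Proposal (assumed): only NA, None and nan are valid nulls for strings."""
    return value is None or value is NA or _is_float_nan(value)


def infer_status_quo(values):
    # Makes a copy: drop every NA first, then validate what is left.
    kept = [v for v in values if not is_any_na(v)]
    return "string" if all(isinstance(v, str) for v in kept) else "mixed"


def infer_proposed(values):
    # Copy-free: a single pass that skips only string-valid nulls.
    for v in values:
        if isinstance(v, str) or is_valid_string_null(v):
            continue
        return "mixed"
    return "string"


with_nat = ["foo", NaT, "bar"]
with_nan = ["foo", float("nan"), None, "bar"]

print(infer_status_quo(with_nat), infer_proposed(with_nat))  # string mixed
print(infer_status_quo(with_nan), infer_proposed(with_nan))  # string string
```

Only the array containing the `NaT` stand-in changes its result, which is the behavior change being proposed for [2].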

API: infer_dtype with skipna=True only skip valid-for-dtype NAs #45022