Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
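import numpy as np
import pandas as pd

print(pd.__version__)

df = pd.DataFrame({"A": ["1", "", "3"]}, dtype="string")

# inplace=False: inspect the value written into the underlying array
try:
    result = df.where(df != "", np.nan)
    arr = result["A"]._values
    print(arr)
    print(type(arr[1]))
except Exception as e:
    print(e)

# inplace=True: same operation, applied to df itself
df.where(df != "", np.nan, inplace=True)
print(df)
arr = df["A"]._values
print(arr)
print(type(arr[1]))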
Issue Description
The code sample is based on #46366. Output on 1.4.1, followed by output on main:
1.4.1
StringArray requires a sequence of strings or pandas.NA
A
0 1
1 NaN
2 3
<StringArray>
['1', nan, '3']
Length: 3, dtype: string
<class 'float'>
1.5.0.dev0+595.gf99ec8bf80
<StringArray>
['1', <NA>, '3']
Length: 3, dtype: string
<class 'pandas._libs.missing.NAType'>
A
0 1
1 NaN
2 3
<StringArray>
['1', nan, '3']
Length: 3, dtype: string
<class 'float'>
Expected Behavior
The behavior for the inplace=False case has changed from 1.4.1 to main since #45168 allows other NA values in the StringArray constructor.
Whether this is correct for the DataFrame.where case may need discussion. Either way, the results for the inplace=True case look incorrect to me and should be consistent with the inplace=False case.
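A minimal sketch of the consistency check this implies, using only the public .array accessor rather than _values. The helper name where_both_ways is hypothetical and not part of pandas; it assumes a version where the inplace=False call does not raise (on 1.4.1 it raises, as shown above).

```python
import numpy as np
import pandas as pd


def where_both_ways(df, cond, other):
    """Apply DataFrame.where out of place and in place on separate copies.

    Hypothetical helper for illustration only; not a pandas API.
    """
    out_of_place = df.where(cond, other)
    in_place = df.copy()
    in_place.where(cond, other, inplace=True)
    return out_of_place, in_place


df = pd.DataFrame({"A": ["1", "", "3"]}, dtype="string")
out_of_place, in_place = where_both_ways(df, df != "", np.nan)

# Compare the type of the missing value stored by each path.
missing_out = out_of_place["A"].array[1]
missing_in = in_place["A"].array[1]
print(type(missing_out), type(missing_in))
print("consistent:", type(missing_out) is type(missing_in))
```

If the two paths agree, both types should match (pd.NA in each case would be my expectation for a string dtype); a float in only one of them reproduces the inconsistency described here.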
Installed Versions
.