Description
-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
a = pd.Series(["a", None, "c"], dtype="string")
b = pd.Series([None, "b", None], dtype="string")
a.update(b)
results in:
Traceback (most recent call last):
File "<ipython-input-15-b9da8f25067a>", line 1, in <module>
a.update(b)
File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\series.py", line 2810, in update
self._data = self._data.putmask(mask=mask, new=other, inplace=True)
File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\internals\managers.py", line 564, in putmask
return self.apply("putmask", **kwargs)
File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\internals\managers.py", line 442, in apply
applied = getattr(b, f)(**kwargs)
File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\internals\blocks.py", line 1676, in putmask
new_values[mask] = new
File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\arrays\string_.py", line 248, in __setitem__
super().__setitem__(key, value)
File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\arrays\numpy_.py", line 252, in __setitem__
self._ndarray[key] = value
ValueError: NumPy boolean array indexing assignment cannot assign 3 input values to the 1 output values where the mask is true
Problem description
The example works if I leave off the dtype="string"
(resulting in the implicit dtype object
).
IMO update should work for all dtypes, not only the "old" ones.
a = pd.Series([1, None, 3], dtype="Int16")
etc. also raises ValueError, while the same with dtype="float64"
works.
It looks as if update doesn't work with the new nullable dtypes (the ones with pd.NA
).
Expected Output
The expected result is that a.update(b)
updates a
without raising an exception, not only for object
and float64
, but also for string
and Int16
etc..
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 ..., GenuineIntel
...
pandas : 1.0.3
numpy : 1.18.1
...