BUG: Fixed PandasArray.__setitem__ with str #28119
Changes from 1 commit
f0718fa
75c2c58
f482d99
6b8dfe2
1828890
9d77af5
@@ -8,6 +8,7 @@
 from pandas.util._decorators import Appender
 from pandas.util._validators import validate_fillna_kwargs
 
+from pandas.core.dtypes.common import is_object_dtype
 from pandas.core.dtypes.dtypes import ExtensionDtype
 from pandas.core.dtypes.generic import ABCIndexClass, ABCSeries
 from pandas.core.dtypes.inference import is_array_like, is_list_like

@@ -236,7 +237,13 @@ def __setitem__(self, key, value):
             value = np.asarray(value)
 
         values = self._ndarray
-        t = np.result_type(value, values)
+        if isinstance(value, str):
+            if is_object_dtype(self.dtype._dtype):
+                t = np.dtype(object)
+            else:
+                t = self.dtype._dtype

Do we have a test that hits this branch?

Simpler to leave the original code and just convert np.str to np.object (which is what we do inside the blocks manager and other places); maybe have a function to do this rather than rewriting the logic all over the place.

I don't think that's appropriate for PandasArray. The idea is to take an arbitrary numpy array and box it in an extension array.

And that's exactly what is done in ObjectBlock now. Please refactor rather than adding logic.

I doubt it. I think I was mimicking the behavior of

In [4]: x = np.array([1, 2, 3])
In [5]: s = pd.Series(x)
In [6]: s.values is x
Out[6]: True
In [7]: s[0] = 'a'
In [8]: s.values is x
Out[8]: False

But I'm happy to be stricter here.

That said, we'll also inherit things like

In [11]: x = np.array([1, 2, 3])
In [12]: x[0] = 5.5
In [13]: x
Out[13]: array([5, 2, 3])

But maybe that's OK, if the intent is to be close to NumPy here.

To what extent can we punt on the float treatment for now? I think there's a case to be made that we should raise instead of casting there, but I don't want to bog this down any more.

Hmm, I think our options are to always raise when the dtypes don't match, or to adopt NumPy's behavior. I don't have a preference.

The thought that pushes me towards raising is that if/when this is backing a Block, we want
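
The NumPy behaviors weighed in this thread can be checked directly. The following is a minimal sketch using plain NumPy only (no pandas); the variable names are illustrative and not part of the PR.

```python
import numpy as np

# Plain item assignment casts the value to the array's dtype without
# a safety check, so a float is truncated when stored in an int array.
x = np.array([1, 2, 3])
x[0] = 5.5
print(x)  # [5 2 3]

# A string that cannot be parsed as an integer is rejected outright.
try:
    x[1] = "a"
    str_error = None
except ValueError as exc:
    str_error = exc
print(type(str_error).__name__)  # ValueError

# casting="safe" (the mode used by astype in the diff) allows widening
# conversions and boxing to object, but refuses lossy ones.
can_widen = np.can_cast(np.int64, np.float64, casting="safe")
can_narrow = np.can_cast(np.float64, np.int64, casting="safe")
can_box = np.can_cast(np.int64, np.dtype(object), casting="safe")
print(can_widen, can_narrow, can_box)  # True False True
```

So "adopting NumPy's behavior" means silent truncation for floats and a ValueError for unparseable strings, while "always raise" would reject both.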

+        else:
+            t = np.result_type(value, values)
         if t != self._ndarray.dtype:
             values = values.astype(t, casting="safe")
         values[key] = value
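
To make the effect of the patched branch easier to follow, here is a rough standalone sketch of the same dtype resolution applied to a bare ndarray. The function name `setitem_sketch` is hypothetical, `values.dtype` stands in for `self.dtype._dtype`, and a plain equality check replaces `is_object_dtype`; it is not the code in the PR and ignores everything else `PandasArray.__setitem__` does.

```python
import numpy as np


def setitem_sketch(values, key, value):
    """Hypothetical stand-in for the dtype handling shown in the diff."""
    if isinstance(value, str):
        # Mirrors the two branches in the diff: for a str value the
        # target dtype is the array's own dtype, so no upcast happens.
        if values.dtype == object:
            t = np.dtype(object)
        else:
            t = values.dtype
    else:
        t = np.result_type(value, values)
    if t != values.dtype:
        # astype returns a new array here; the input is left untouched.
        values = values.astype(t, casting="safe")
    values[key] = value
    return values


ints = np.array([1, 2, 3])
upcast = setitem_sketch(ints, 0, 5.5)
print(upcast.dtype, ints)  # float64 [1 2 3]

objs = np.array([1, 2, 3], dtype=object)
same = setitem_sketch(objs, 0, "a")
print(same is objs)  # True

try:
    setitem_sketch(np.array([1, 2, 3]), 0, "a")
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

Under these assumptions, a float assigned into an integer array is upcast to float64 rather than truncated, a str goes into an object array in place, and a str assigned into an integer array falls through to NumPy's own ValueError.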