BUG: Series __setitem__ gives wrong result with bool indexer #30580

Merged
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -858,6 +858,7 @@ Indexing
 - Bug when indexing with ``.loc`` where the index was a :class:`CategoricalIndex` with non-string categories didn't work (:issue:`17569`, :issue:`30225`)
 - :meth:`Index.get_indexer_non_unique` could fail with `TypeError` in some cases, such as when searching for ints in a string index (:issue:`28257`)
 - Bug in :meth:`Float64Index.get_loc` incorrectly raising ``TypeError`` instead of ``KeyError`` (:issue:`29189`)
+- Bug in :meth:`Series.__setitem__` incorrectly assigning values with boolean indexer when the length of new data matches the number of ``True`` values and new data is not a ``Series`` or an ``np.array`` (:issue:`30567`)
 
 Missing
 ^^^^^^^
15 changes: 7 additions & 8 deletions pandas/core/internals/blocks.py
@@ -944,15 +944,14 @@ def putmask(self, mask, new, align=True, inplace=False, axis=0, transpose=False)
                 and np.any(mask[mask])
                 and getattr(new, "ndim", 1) == 1
             ):
-
-                if not (
-                    mask.shape[-1] == len(new)
-                    or mask[mask].shape[-1] == len(new)
-                    or len(new) == 1
-                ):
+                if mask[mask].shape[-1] == len(new):  # GH 30567
+                    np.place(new_values, mask, new)
Review comment (Member):

I had to look up what np.place was. Is there a more common idiom we can use here?

Reply (Member Author):

I can't find any close alternative in numpy. I think np.place is the right choice, as it does exactly what we need here: it places the given values, in order, into the masked locations, rather than assigning the values found at the masked positions of the new data to those locations. Besides, it is designed specifically to work with boolean masks, which is what we are dealing with here.

+                elif mask.shape[-1] == len(new) or len(new) == 1:
+                    np.putmask(new_values, mask, new)
+                else:
                     raise ValueError("cannot assign mismatch length to masked array")
-
-            np.putmask(new_values, mask, new)
+            else:
+                np.putmask(new_values, mask, new)
 
         # maybe upcast me
         elif mask.any():
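
To make the distinction drawn in the review thread concrete, here is a minimal NumPy sketch. The arrays and values are illustrative and not taken from the PR. For each masked position `n`, `np.putmask` picks `new[n]` (cycling through `new` when it is shorter than the target), while `np.place` hands out the values in order, one per `True` entry:

```python
import numpy as np

mask = np.array([False, True, True, False, True])
new = [10, 20, 30]  # len(new) equals the number of True values in mask

# np.putmask indexes `new` by position in the target array and repeats
# it when it is too short, so the values end up misaligned.
a = np.zeros(5, dtype=int)
np.putmask(a, mask, new)
print(a.tolist())  # [0, 20, 30, 0, 20]

# np.place consumes `new` sequentially, one value per True entry.
b = np.zeros(5, dtype=int)
np.place(b, mask, new)
print(b.tolist())  # [0, 10, 20, 0, 30]
```

The first result illustrates the kind of silent misassignment that GH 30567 describes; the second is the assignment a user would expect.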
10 changes: 10 additions & 0 deletions pandas/tests/indexing/test_indexing.py
@@ -1190,3 +1190,13 @@ def test_duplicate_index_mistyped_key_raises_keyerror():
 
     with pytest.raises(KeyError):
         ser.index._engine.get_loc(None)
+
+
+def test_setitem_with_bool_mask_and_values_matching_n_trues_in_length():
+    # GH 30567
+    ser = pd.Series([None] * 10)
+    mask = [False] * 3 + [True] * 5 + [False] * 2
+    ser[mask] = range(5)
+    result = ser
+    expected = pd.Series([None] * 3 + list(range(5)) + [None] * 2).astype("object")
+    tm.assert_series_equal(result, expected)
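
The branch order introduced in `putmask` can also be sketched in plain NumPy, outside of pandas. The helper name `assign_with_mask` is hypothetical and exists only for this illustration; it ignores the dtype handling, transposition and upcasting that the real method performs, and only mirrors the three-way length check from the diff:

```python
import numpy as np


def assign_with_mask(values, mask, new):
    """Hypothetical helper mirroring the patched branch order (not pandas API)."""
    values = values.copy()
    mask = np.asarray(mask, dtype=bool)
    new = list(new)  # accept ranges, tuples and other 1-D list-likes
    if not mask.any():
        return values  # nothing selected, nothing to assign
    if mask.sum() == len(new):
        # One value per True entry: fill the selected slots in order.
        np.place(values, mask, new)
    elif len(mask) == len(new) or len(new) == 1:
        # Full-length or length-1 data: positional assignment, with repetition.
        np.putmask(values, mask, new)
    else:
        raise ValueError("cannot assign mismatch length to masked array")
    return values


# Same data as the test above, using an object-dtype array instead of a Series.
values = np.array([None] * 10, dtype=object)
mask = [False] * 3 + [True] * 5 + [False] * 2
out = assign_with_mask(values, mask, range(5))
print(out.tolist())  # [None, None, None, 0, 1, 2, 3, 4, None, None]
```

Checking the "one value per `True`" case first matters: if the order were reversed, data whose length happened to satisfy the second condition would still go through `np.putmask`.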