Skip to content

BUG: fix bool no-op replace calls #6339

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 13, 2014
Merged

BUG: fix bool no-op replace calls #6339

merged 4 commits into from
Feb 13, 2014

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Feb 13, 2014

closes #6332

values = np.array(values_to_mask)
cant_cast = not np.can_cast(values.dtype, arr.dtype, casting='safe')

if cant_cast and arr.dtype == np.bool_:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is true, then isn't the result a noop? (regardless if its a bool or not?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. the values_to_mask must not be castable specifically to bool and arr.dtype must be be bool, that's what the arr.dtype == np.bool_ is there for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and you don't want to have a BoolBlock handler instead of this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually yes you're right ... don't know why i didn't of this before but i can just override putmask in boolblock and check for the new dtype there

@cpcloud
Copy link
Member Author

cpcloud commented Feb 13, 2014

@jreback look okay to u?

@jreback
Copy link
Contributor

jreback commented Feb 13, 2014

why does this work?

@cpcloud
Copy link
Member Author

cpcloud commented Feb 13, 2014

For the same reason I mentioned in #6332. np.atleast_1d is a call to array plus some insurance that you get at least a 1D array. Using the SO example, that will create an array of one string. S dtypes cannot be safely cast to bool therefore I pass in the array of strings which will ultimately raise ValueError in mask_missing which will convert them to object dtype. Then mask_missing procedes to compare each non NA value with those in arr. Since no strings are equal to True or False, the mask is an array of Falses. Does that suffice?

@cpcloud
Copy link
Member Author

cpcloud commented Feb 13, 2014

dicts work too

In [18]: s
Out[18]:
0     True
1    False
2     True
dtype: bool

In [19]: s.replace({'asdf': 'asdb', True: 'a'})
Out[19]:
0        a
1    False
2        a
dtype: object

In [20]: s.replace({False: 'asdb', True: 'a'})
Out[20]:
0       a
1    asdb
2       a
dtype: object

If there's no test I'll add one

@cpcloud
Copy link
Member Author

cpcloud commented Feb 13, 2014

One thing that won't work as "expected" is df.replace({True: False, False: True}). The keys are iterated over in an arbitrary order and you'll end up with either all Trues or all Falses. I don't really think this is that common.

@cpcloud cpcloud self-assigned this Feb 13, 2014
@jreback
Copy link
Contributor

jreback commented Feb 13, 2014

merge away

neat trick with atleast1 d

cpcloud added a commit that referenced this pull request Feb 13, 2014
BUG: fix bool no-op replace calls
@cpcloud cpcloud merged commit 3e54611 into pandas-dev:master Feb 13, 2014
@cpcloud cpcloud deleted the replace-bool-fix branch February 13, 2014 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: replace should handle Bool blocks correctly
2 participants