-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
CLN: .values -> ._values #34083
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: .values -> ._values #34083
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not fully convinced we should also use _values
for DataFrame, but you might have your reasons, if can explain?
pandas/core/apply.py
Outdated
def values(self): | ||
return self.obj.values | ||
def values(self) -> np.ndarray: | ||
return self.obj._values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
obj
is a DataFrame here, so is there a good reason to change values
to _values
? (in practice, it's actually _values
that is implemented in terms of values
)
The reason to do this for Series (_values
being the "internal" values) doesn't really apply for DataFrame?
The process of going through the code looking for problematic usages of .values is grep-based, so doing this change in places where they are equivalent trims the problem for future passes. |
I am not the one making those changes, so it is easier for me to say, but: I am not convinced this is worth the easier grepping, as this change makes reading the code more confusing IMO. For me, Personally, I would rather remove the If the use of |
this looks ok to me. |
@jbrockmendel what do you think about using |
In cases where |
That's fine for me as well, I don't have a strong preference, it was just as a suggestion to have an alternative for |
That's a reasonable thought. One more pitch for ._values before punting on this: I use the heuristic: "when doing this with Series, would i use _values?" If so, using it in the DataFrame context is more internally consistent and potentially amenable to code-sharing. If I see |
It's only consistent in "naming", but not in what it does.
Do you know if for example in this PR there are actually cases of this? (where the object in question could be both Series or DataFrame). We certainly have (had) things like |
@jorisvandenbossche how strong are your opinions on this? if they're a blocker, ill revert the DataFrame pieces so we can push the rest through |
Or directly use I still think |
we've already established this as a pattern i dislike. the status quo seems to be the only point of agreement |
It was you who proposed |
I proposed asarray specifically for those cases where we want an ndarray and only an ndarray |
Which is what |
We're going in circles. I'm going to revert the edits of DataFrame.values->DataFrame._values. |
I think it is useful to try to clear up our misunderstanding, as it will come up again.
So do I read this correct that you are OK with using But then, for a DataFrame, the only return values can be an ndarray, so for dataframes we know we always want / will get an ndarray? So doesn't fall that under the above description of cases where we can use |
I'd prefer to punt on this until then.
I'm thinking about a future where we may have 2D EAs. |
controversial parts reverted; green |
k thanks |
No description provided.