REF: avoid _internal_get_values in json get_values #32754

Merged: 2 commits, Mar 17, 2020

jbrockmendel (Member)

cc @WillAyd

As mentioned in #32731, this changes the behavior for CategoricalIndex[dt64tz] and Series with that for its index. Suggestions for where tests for that should go?

Note: the Categorical[dt64tz] cases still do a trip through object-dtype; I don't want to futz with that in C-space.

WillAyd (Member) commented Mar 16, 2020

> Suggestions for where tests for that should go?

Pretty much everything right now goes into test_pandas; there's a PR to start fixturing / splitting things that I promise I'll get back to... but for now just place them there.

WillAyd (Member) left a comment

Nice simplification!

@@ -222,13 +222,19 @@ static PyObject *get_values(PyObject *obj) {

     PRINTMARK();

-    if (PyObject_HasAttrString(obj, "_internal_get_values")) {
+    if (PyObject_TypeCheck(obj, cls_index) || PyObject_TypeCheck(obj, cls_series)) {
Member

I guess we'll see from the test cases, but is this really required? AFAICT on master, _internal_get_values and values return the same thing for a dt64tz array:

>>> dts = pd.date_range(start="2020-01-01", freq="D", periods=3, tz="US/Pacific")
>>> dts._internal_get_values()
array(['2020-01-01T08:00:00.000000000', '2020-01-02T08:00:00.000000000',
       '2020-01-03T08:00:00.000000000'], dtype='datetime64[ns]')
>>> dts.values
array(['2020-01-01T08:00:00.000000000', '2020-01-02T08:00:00.000000000',
       '2020-01-03T08:00:00.000000000'], dtype='datetime64[ns]')

Member Author

It's Categorical[dt64tz] and Sparse that make the extra __array__ call necessary.
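To illustrate the point above from the Python side: for a CategoricalIndex over tz-aware datetimes, or a Series backed by a sparse array, `.values` hands back an extension array rather than a NumPy ndarray, so a further conversion through `__array__` (here via `np.asarray`) is what produces an ndarray. The sketch below is only a rough Python analogue of the type check in the diff, not the C implementation; `to_ndarray` is a hypothetical helper name, and `tz="UTC"` is chosen just to keep the example self-contained.

```python
import numpy as np
import pandas as pd


def to_ndarray(obj):
    """Hypothetical helper: take .values from an Index/Series, then force an ndarray."""
    # Mirrors the idea of the type check in the diff: only Index/Series get .values.
    values = obj.values if isinstance(obj, (pd.Index, pd.Series)) else obj
    # np.asarray goes through the object's __array__ method when it is not
    # already an ndarray (e.g. Categorical, SparseArray).
    return np.asarray(values)


dts = pd.date_range(start="2020-01-01", freq="D", periods=3, tz="UTC")
cat_index = pd.CategoricalIndex(dts)
sparse_ser = pd.Series(pd.arrays.SparseArray([1, 0, 0]))

# .values alone is not an ndarray in these two cases.
cat_values = cat_index.values      # a Categorical
sparse_values = sparse_ser.values  # a SparseArray

# After the extra conversion, both are plain ndarrays.
cat_arr = to_ndarray(cat_index)
sparse_arr = to_ndarray(sparse_ser)

print(type(cat_values).__name__, "->", type(cat_arr).__name__)
print(type(sparse_values).__name__, "->", type(sparse_arr).__name__)
```

For a plain tz-aware DatetimeIndex, as in the example in the review comment, `.values` is already an ndarray, so the `np.asarray` call changes nothing there.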

WillAyd added the Clean and IO JSON (read_json, to_json, json_normalize) labels Mar 16, 2020
jreback added this to the 1.1 milestone Mar 17, 2020

jreback (Contributor) commented Mar 17, 2020

lgtm. happy if you guys are.

WillAyd merged commit 402c5cd into pandas-dev:master Mar 17, 2020

WillAyd (Member) commented Mar 17, 2020

Thanks @jbrockmendel

jbrockmendel deleted the json-values-2 branch March 17, 2020 02:21
SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020