BUG: Fix #15344 by backporting ujson usage of PEP 393 API #15360
lgtm. can you add a whatsnew note? (Bug fix is fine)
https://travis-ci.org/pandas-dev/pandas/jobs/200143310 we have a c-linter :<
Ouch :-). Hopefully fixed now.
thanks @tobgu keep em coming! I am still puzzled why encoding a new UTF8 string, which returns a new reference, affects existing references. Though there may be some deep things going on that I don't understand. If you have time I would search / post an issue on CPython to understand the semantics here. It's also possible that this is the reason this function is being deprecated (because of these side effects).
I think the culprit in this case is the

It seems like there's actually an allocation for a full strlen*4 byte unicode string here that would be the root cause of the observed behaviour:

Haven't dug into it further than that...

Thanks @jreback and @chris-b1 for helping out with this! It would have been a show stopper for the application that I'm currently working on. Hope to see it released soon! :-)
thanks for looking @tobgu, that is some nasty code.
Make use of the PEP 393 API to avoid expanding single byte ascii characters into four byte unicode characters when encoding objects to json.

closes pandas-dev#15344

Author: Tobias Gustafsson <tobias.l.gustafsson@gmail.com>

Closes pandas-dev#15360 from tobgu/backport-ujson-compact-ascii-encoding and squashes the following commits:

44de133 [Tobias Gustafsson] Fix C-code formatting to pass linting of GH15344
b7e404f [Tobias Gustafsson] Merge branch 'master' into backport-ujson-compact-ascii-encoding
4e8e2ff [Tobias Gustafsson] BUG: Fix pandas-dev#15344 by backporting ujson usage of PEP 393 APIs for compact ascii
Make use of the PEP 393 API to avoid expanding single byte ascii characters into four byte unicode characters when encoding objects to json.
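For readers unfamiliar with the storage difference the description refers to, here is a rough sketch of it from the Python side. It is not code from this PR; the helper name `bytes_per_char` is made up for illustration, and the numbers assume CPython 3.3 or later, where PEP 393 stores each string with the narrowest width that fits its widest character.

```python
import sys


def bytes_per_char(s: str) -> float:
    """Marginal storage cost per code point (hypothetical helper).

    Comparing a string with a doubled copy of itself cancels out the
    fixed object header, leaving only the per-character payload.
    """
    return (sys.getsizeof(s * 2) - sys.getsizeof(s)) / len(s)


# Pure ASCII text is stored compactly, one byte per character.
ascii_width = bytes_per_char("a" * 1000)

# Text containing code points above U+FFFF needs four bytes per character.
wide_width = bytes_per_char("\U0001F600" * 1000)

print(ascii_width, wide_width)
```

A JSON encoder that widens every ASCII string to the four-byte form before encoding would pay roughly the second cost for data that only needs the first, which is the expansion this change is meant to avoid.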
git diff upstream/master | flake8 --diff