Skip to content

[perf] JSON encoding can be faster by skipping string building. #2393

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 17, 2019

Conversation

drodriguez
Copy link
Contributor

I realized that one of the test related to CharacterSet was very slow in
constrained devices, but also relatively slow in beefy Linux machines
(taking a couple of seconds to perform back and forth JSON
encoding/decoding trips in memory). The test in question have to embed
relatively large Data in the JSON string (around 128KB, if I remember
correctly), which are serialized as Base64.

When serializing the data, the serializer asks for the String
representation of each object, and then appends all of them into a
String, to later transform into an UTF-8 C string copied into a Data.
One can improve performance by appending the UTF-8 representation of
each piece of the JSON serialization instead of the String building
code, since appending [UInt8] without doing all the checking that String
will do is faster for this case.

I realized that one of the test related to CharacterSet was very slow in
constrained devices, but also relatively slow in beefy Linux machines
(taking a couple of seconds to perform back and forth JSON
encoding/decoding trips in memory). The test in question have to embed
relatively large Data in the JSON string (around 128KB, if I remember
correctly), which are serialized as Base64.

When serializing the data, the serializer ask for the String
representation of each object, and then appends all of them into a
String, to later transform into an UTF-8 C string copied into a Data.
One can improve performance by appending the UTF-8 representation of
each piece of the JSON serialization instead of the String building
code, since appending [UInt8] without doing all the checking that String
will do is faster for this case.
@drodriguez
Copy link
Contributor Author

@swift-ci please test

@drodriguez
Copy link
Contributor Author

macOS seems to still fail with some Xcode error.

@ianpartridge
Copy link
Contributor

Cool. How much faster is it?

@drodriguez
Copy link
Contributor Author

IIRC the constrained device I was using reduced from 42 seconds to 32 seconds. I think this was all the TestCodable/test_CharacterSet_JSON. Those times were building without stdlib assertions enabled. With those, the test seemed not to finish in the same device (that was why I noticed). In a beefy Linux machine, but with the stdlib assertions, I think the time changed from 7 to 6 seconds, and it was not really significant without the stdlib assertions enabled.

@millenomi
Copy link
Contributor

cc @bendjones — can you take a quick look here?

@millenomi
Copy link
Contributor

And: is it worth it to port it to the overlay?

@millenomi
Copy link
Contributor

Per offline discussion with @bendjones — yup; for the overlay, the perf characteristics of String's .utf8 may not make this a win, depending on what goes on.

@millenomi millenomi merged commit b506a57 into swiftlang:master Jul 17, 2019
@bendjones
Copy link
Contributor

Seems fine for SCF to me and a reasonable change. As to the porting to the overlay question I’d just want to make sure the UTF8 guarantees are the same for bridged types (I think it’s fine but want to make sure and measure)


var writer = JSONWriter(
pretty: opt.contains(.prettyPrinted),
sortedKeys: opt.contains(.sortedKeys),
writer: { (str: String?) in
if let str = str {
jsonStr.append(str)
jsonStr.append(contentsOf: Array(str.utf8))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this when I scanned this earlier but you could be able to avoid the array allocation by just doing jsonStr.append(contentsOf: str.utf8) right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it seems it should work. I will prepare a follow up. Thanks for pointing it out!

@drodriguez drodriguez deleted the perf-json-encode branch July 17, 2019 21:42
drodriguez added a commit to drodriguez/swift-corelibs-foundation that referenced this pull request Jul 17, 2019
Array.append(contentsOf:) supports any Sequence, and String.UTF8View is
a sequence, so there's no need to create an Array from it to append it.

From a post-merge feedback in swiftlang#2393.
e78l added a commit to e78l/swift-corelibs-foundation that referenced this pull request Jul 22, 2019
Array.append(contentsOf:) supports any Sequence, and String.UTF8View is
a sequence, so there's no need to create an Array from it to append it.

From a post-merge feedback in swiftlang#2393.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants