[perf] JSON encoding can be faster by skipping string building. #2393

drodriguez · 2019-07-05T22:48:05Z

I realized that one of the tests related to CharacterSet was very slow on
constrained devices, and also relatively slow on beefy Linux machines
(taking a couple of seconds to perform back-and-forth JSON
encoding/decoding trips in memory). The test in question has to embed
relatively large Data values in the JSON string (around 128KB, if I
remember correctly), which are serialized as Base64.

When serializing the data, the serializer asks for the String
representation of each object and appends all of them to a String,
which is later transformed into a UTF-8 C string and copied into a Data.
One can improve performance by appending the UTF-8 representation of
each piece of the JSON serialization instead of building a String,
since appending [UInt8] without all the checking that String does is
faster for this case.
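
A minimal sketch of the two approaches, to make the difference concrete. The `pieces` array and both function names are hypothetical stand-ins for what the JSON writer emits; this is not the actual JSONSerialization code, and the final conversion to Data is simplified.

```swift
import Foundation

// Hypothetical pieces, in the order a JSON writer might emit them.
let pieces = ["{", "\"data\"", ":", "\"AAEC\"", "}"]

// Before: accumulate everything in a String, convert to UTF-8 at the end.
// (Simplified: the real code goes through a UTF-8 C string.)
func encodeViaString(_ pieces: [String]) -> Data {
    var jsonStr = String()
    for piece in pieces {
        jsonStr.append(piece)
    }
    return Data(jsonStr.utf8)
}

// After: accumulate raw UTF-8 bytes and skip the intermediate String.
func encodeViaBytes(_ pieces: [String]) -> Data {
    var jsonBytes = [UInt8]()
    for piece in pieces {
        // Mirrors the shape of the change in this PR.
        jsonBytes.append(contentsOf: Array(piece.utf8))
    }
    return Data(jsonBytes)
}
```

Both functions should produce the same bytes; only the intermediate buffer type differs.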

drodriguez · 2019-07-05T22:48:18Z

@swift-ci please test

drodriguez · 2019-07-06T00:42:07Z

macOS seems to still fail with some Xcode error.

ianpartridge · 2019-07-06T07:09:47Z

Cool. How much faster is it?

drodriguez · 2019-07-07T03:20:59Z

IIRC, on the constrained device I was using, the time dropped from 42 seconds to 32 seconds. I think this was all for TestCodable/test_CharacterSet_JSON. Those times were measured in a build without stdlib assertions enabled. With assertions enabled, the test did not seem to finish on the same device (that is why I noticed). On a beefy Linux machine with stdlib assertions enabled, I think the time changed from 7 to 6 seconds, and the difference was not really significant without them.

millenomi · 2019-07-15T20:45:08Z

cc @bendjones — can you take a quick look here?

millenomi · 2019-07-15T20:45:31Z

And: is it worth it to port it to the overlay?

millenomi · 2019-07-17T19:39:14Z

Per offline discussion with @bendjones — yup; for the overlay, the perf characteristics of String's .utf8 may not make this a win, depending on what goes on.

bendjones · 2019-07-17T19:39:41Z

Seems fine for SCF to me and a reasonable change. As for the question of porting to the overlay, I’d just want to make sure the UTF8 guarantees are the same for bridged types (I think it’s fine, but I want to make sure and measure).

bendjones · 2019-07-17T20:25:21Z

Foundation/JSONSerialization.swift


        var writer = JSONWriter(
            pretty: opt.contains(.prettyPrinted),
            sortedKeys: opt.contains(.sortedKeys),
            writer: { (str: String?) in
                if let str = str {
-                    jsonStr.append(str)
+                    jsonStr.append(contentsOf: Array(str.utf8))


I missed this when I scanned this earlier, but you should be able to avoid the array allocation by just doing jsonStr.append(contentsOf: str.utf8), right?

Yes, it seems it should work. I will prepare a follow up. Thanks for pointing it out!

Array.append(contentsOf:) supports any Sequence, and String.UTF8View is a sequence, so there's no need to create an Array from it to append it. From a post-merge feedback in swiftlang#2393.
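
A small sketch of the follow-up, assuming a `[UInt8]` buffer like the one in the diff above. The sample string and the variable names `withCopy` and `withoutCopy` are made up for illustration.

```swift
// Hypothetical piece of JSON text, including a non-ASCII character.
let str = "\"key\": \"värde\""

// As merged in this PR: builds a temporary Array before appending.
var withCopy = [UInt8]()
withCopy.append(contentsOf: Array(str.utf8))

// Follow-up: String.UTF8View is a Sequence of UInt8, so it can be
// appended directly without the intermediate Array.
var withoutCopy = [UInt8]()
withoutCopy.append(contentsOf: str.utf8)
```

The resulting bytes are identical; the second form just avoids the extra allocation and copy.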

millenomi merged commit b506a57 into swiftlang:master Jul 17, 2019

bendjones reviewed Jul 17, 2019

drodriguez deleted the perf-json-encode branch July 17, 2019 21:42

drodriguez mentioned this pull request Jul 17, 2019

Remove unnecessary Array copy. #2420

Merged
