From a8083619591fd9c78dabfb3ded99244be98187cd Mon Sep 17 00:00:00 2001 From: Cory Benfield Date: Mon, 8 Mar 2021 10:08:04 +0000 Subject: [PATCH] Fix CoW in HTTPResponseAggregator Motivation: HTTPResponseAggregator attempts to build a single, complete response object. This necessarily means it loads the entire response payload into memory. It wants to provide this payload as a single contiguous buffer of data, and it does so by aggregating the data into a single contiguous buffer as it goes. Because ByteBuffer does exponential reallocation, the cost of doing this should be amortised constant-time, even though we do have to copy some data sometimes. However, if this operation triggers a copy-on-write then the operation will become quadratic. For large buffers this will rapidly come to dominate the runtime. Unfortunately in at least Swift 5.3 Swift cannot safely see that during the body stanza the state variable is dead. Swift is not necessarily wrong about this: there's a cross-module call to ByteBuffer.writeBuffer in place and Swift cannot easily prove that that call will not lead to a re-entrant access of the `HTTPResponseAggregator` object. For this reason, during the call to `didReceiveBodyPart` there will be two copies of the body buffer alive, and so the write will CoW. This quadratic behaviour is a nasty performance trap that can become highly apparent even at quite small body sizes. Modifications: While Swift can't prove that the `self.state` variable is dead, we can! To that end, we temporarily set it to a different value that does not store the buffer in question. This will force Swift to drop the ref on the buffer, making it uniquely owned and avoiding the CoW. Sadly, it's extremely difficult to test for "does not CoW", so this patch does not currently come with any tests. I have experimentally verified the behaviour. Result: No copy-on-write in the HTTPResponseAggregator during body aggregation. --- Sources/AsyncHTTPClient/HTTPHandler.swift | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Sources/AsyncHTTPClient/HTTPHandler.swift b/Sources/AsyncHTTPClient/HTTPHandler.swift index e79ceffbd..c15b64a1e 100644 --- a/Sources/AsyncHTTPClient/HTTPHandler.swift +++ b/Sources/AsyncHTTPClient/HTTPHandler.swift @@ -380,6 +380,11 @@ public class ResponseAccumulator: HTTPClientResponseDelegate { case .head(let head): self.state = .body(head, part) case .body(let head, var body): + // The compiler can't prove that `self.state` is dead here (and it kinda isn't, there's + // a cross-module call in the way) so we need to drop the original reference to `body` in + // `self.state` or we'll get a CoW. To fix that we temporarily set the state to `.end` (which + // has no associated data). We'll fix it at the bottom of this block. + self.state = .end var part = part body.writeBuffer(&part) self.state = .body(head, body)