-
Notifications
You must be signed in to change notification settings - Fork 113
RFC: instrument lambda handler #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
3bfc54f
4fa2e4c
99e2d36
b639901
6609e51
7b42510
fed24c2
50336b6
5ee3cfe
5f9773b
ccb11da
bda0014
f3b4beb
8cc1072
03b745f
dc5c7b9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ import Baggage | |
import Dispatch | ||
import Logging | ||
import NIO | ||
import NIOConcurrencyHelpers | ||
|
||
// MARK: - InitializationContext | ||
|
||
|
@@ -52,6 +53,9 @@ extension Lambda { | |
/// Lambda runtime context. | ||
/// The Lambda runtime generates and passes the `Context` to the Lambda handler as an argument. | ||
public final class Context: CustomDebugStringConvertible { | ||
// TODO: use RWLock (separate PR) | ||
private let lock = Lock() | ||
|
||
/// The request ID, which identifies the request that triggered the function invocation. | ||
public let requestID: String | ||
|
||
|
@@ -70,8 +74,14 @@ extension Lambda { | |
/// For invocations from the AWS Mobile SDK, data about the client application and device. | ||
public let clientContext: String? | ||
|
||
/// Context baggage. | ||
public var baggage: BaggageContext | ||
// TODO: or should the Lambda "runtime" context and the Baggage context be separate? | ||
private var _baggage: BaggageContext | ||
|
||
/// Baggage context. | ||
public var baggage: BaggageContext { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. does this need to be a public There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's currently forced to this via the Carrier's requirement, but it may well be that requirement is quite wrong and should just be |
||
get { self.lock.withLock { _baggage } } | ||
set { self.lock.withLockVoid { _baggage = newValue } } | ||
} | ||
|
||
/// `Logger` to log with | ||
/// | ||
|
@@ -117,9 +127,9 @@ extension Lambda { | |
logger[metadataKey: "awsRequestID"] = .string(requestID) | ||
logger[metadataKey: "awsTraceID"] = .string(traceID) | ||
var baggage = BaggageContext() | ||
// TODO: handle error | ||
// TODO: use `swift-tracing` API, note that, regardless, we can ONLY extract X-Ray Context | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not really; "we" don't decide what we extract, the configuration of instruments decides that, and yes since xray would be configured it'd extract it's own context here. Specifically:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
the problem here is that only the X-Ray trace context is provided by Lambda Runtime API (in header), see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html#runtimes-api-next Context for other instruments may be provided in invocation payload and needs to be extracted by user in lambda event handler they implement. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, I missed that bit of the Lambda design. So in this integration style, even if one triggered the lambda via an http request, it would not get “the http request” but just the body, and the headers are the ones as listed on there, including the XRay trace header etc. There AFAIR exists an integration mode though to get the entire request, right?
Is this something that the runtime currently is able to handle? Seems more to be about how the API Gateway is configured right? Though I’ve not had the time to dig deeper into this yet. You’d probably know more about this @fabianfett, we can catch up about this today maybe? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess you are referring to API Gateway Lambda proxy integration This does NOT affect Lambda custom runtime API, it affects API Gateway v1 "REST API" routing which, if configured that way, does not try to route events based on a RESTful model, instead it forwards all events to lambda which needs to resolve HTTP method, path and arguments itself based on the content in the event payload (but it still remains in the event payload -> invocation payload):
Note that:
For reference Integrating AWS X-Ray with other AWS services There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for explaining this in more depth @pokryfka, I need to read up some more here about aws/lambda in general it seems. |
||
baggage.xRayContext = try? XRayContext(tracingHeader: traceID) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would we want to let the users decide what tracer to use, or since this is AWS oriented anyways just pin to x-ray? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe that's the main TODO in this PR - not using the xray explicitly. Even if it is bound to xRay in reality, we should make sure to only use the abstract API, maybe maybe some day there would be some other tracer or maybe amazon decide to make their own or something, no idea, but let's keep the door open for future evolution. Using the tracing API also means that while developing locally you could plug in the Instruments(.app) (naming gets confusing...) tracer: slashmo/gsoc-swift-tracing#97 and see spans in Instruments on the mac. Instruments does not really understand / visualize "traces" with parents etc well today... but it's something we can keep in mind, maybe it'll get better at displaying those and then when developing locally you get the same user experience with tracing as on prod :-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ktoso I generally agree with you. I'm however a little concerned, that tracing will require manual adjustment (incl. adding another dependency) if we don't include the XRay tracing by default. Lambda tracing wouldn't work out of the box in this case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I get that -- though the lambda runtime "core" should not be instrumented using any specific tracer, regardless of "yes it'll be xray" but maybe some day down the road there's other impls, and you'd want to swap it. We could absolutely though make some "batteries included" package, we should think how to pull that off, wdyt? |
||
self.baggage = baggage | ||
self._baggage = baggage | ||
self.logger = logger | ||
self.tracer = tracer | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -130,7 +130,6 @@ public extension EventLoopLambdaHandler { | |
/// Driver for `ByteBuffer` -> `In` decoding and `Out` -> `ByteBuffer` encoding | ||
func handle(context: Lambda.Context, event: ByteBuffer) -> EventLoopFuture<ByteBuffer?> { | ||
let segment = context.tracer.beginSegment(name: "HandleEvent", baggage: context.baggage) | ||
// TODO: record errors propagated in result types? | ||
let decodedEvent = segment.subsegment(name: "DecodeIn") { _ in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. btw, I've been wondering if we should offer This is quite common I think so I think we could add it... We could also do the same with a NIO extensions package then to handle Future returning blocks 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ktoso +1 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do use "helpers" both with closures and NIO futures as referenced here slashmo/gsoc-swift-tracing#125 (comment) |
||
self.decodeIn(buffer: event) | ||
} | ||
|
@@ -140,16 +139,10 @@ public extension EventLoopLambdaHandler { | |
segment.end() | ||
return context.eventLoop.makeFailedFuture(CodecError.requestDecoding(error)) | ||
case .success(let `in`): | ||
// TODO: use NIO helpers? | ||
let subsegment = segment.beginSubsegment(name: "HandleIn") | ||
context.baggage = subsegment.baggage | ||
return self.handle(context: context, event: `in`) | ||
.always { result in | ||
if case .failure(let error) = result { | ||
subsegment.addError(error) | ||
} | ||
subsegment.end() | ||
} | ||
.endSegment(subsegment) | ||
.flatMapThrowing { out in | ||
try context.tracer.segment(name: "EncodeOut", baggage: segment.baggage) { _ in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Naming nitpick: I really would like to get all libs to be consistent with the use of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this will be "context" when using var span = tracer.startSpan(named: "EncodeOut", context: segment.baggage) I dont have strong opinion on that, but to avoid confusion in XRaySDK I use "context" for var baggage = BaggageContext() // empty
let context = XRayContext()
baggage.xRayContext = context
let segment = tracer.beginSegment(name: "EncodeOut", baggage: baggage) // may report missing context
// or
let segment2 = tracer.beginSegment(name: "EncodeOut", context: context) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What would you say about the following spelling though:
also because one can use the BaggageContextCarrier to assign through into the underlying baggage context;
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see the point with
though; but that's the segment API, you are free to do what you want there but still I would not recommend using baggage as parameter names, you'd want to accept a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I made a ticket to discuss naming once more: slashmo/gsoc-swift-baggage-context#23 |
||
switch self.encodeOut(allocator: context.allocator, value: out) { | ||
|
@@ -159,12 +152,8 @@ public extension EventLoopLambdaHandler { | |
return buffer | ||
} | ||
} | ||
}.always { result in | ||
if case .failure(let error) = result { | ||
segment.addError(error) | ||
} | ||
segment.end() | ||
} | ||
.endSegment(segment) | ||
} | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,22 +18,24 @@ import Dispatch | |
import Logging | ||
import NIO | ||
|
||
// type names defined in `TracingInstrument`, aliases will be removed | ||
public typealias TracingInstrument = XRayRecorder | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah I see, that confused me for a moment why it's an instrument but used segment() ;) Okey for the PoC, agreed on migrating once "released" though it would be good to use this PR to already migrate and see that the API supports everything one needs here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should make the API transition easier, temporary ... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it be possible to PoC this out using the real swift-tracing API right away though? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should be pretty much straight forward as far API is concerned. I think the missing part is to freeze and release i am trying to keep |
||
public typealias NoOpTracingInstrument = XRayNoOpRecorder | ||
|
||
extension Lambda { | ||
/// LambdaRunner manages the Lambda runtime workflow, or business logic. | ||
internal final class Runner { | ||
private let runtimeClient: RuntimeClient | ||
internal let tracer: TracingInstrument | ||
private let tracer: TracingInstrument | ||
private let eventLoop: EventLoop | ||
private let allocator: ByteBufferAllocator | ||
|
||
private var isGettingNextInvocation = false | ||
|
||
init(eventLoop: EventLoop, configuration: Configuration) { | ||
init(eventLoop: EventLoop, configuration: Configuration, tracer: TracingInstrument) { | ||
self.eventLoop = eventLoop | ||
self.runtimeClient = RuntimeClient(eventLoop: self.eventLoop, configuration: configuration.runtimeEngine) | ||
self.tracer = XRayRecorder(eventLoopGroupProvider: .shared(eventLoop)) | ||
self.runtimeClient = RuntimeClient(eventLoop: self.eventLoop, configuration: configuration.runtimeEngine, tracer: tracer) | ||
self.tracer = tracer | ||
self.allocator = ByteBufferAllocator() | ||
} | ||
|
||
|
@@ -66,7 +68,8 @@ extension Lambda { | |
logger.debug("lambda invocation sequence starting") | ||
// 1. request invocation from lambda runtime engine | ||
self.isGettingNextInvocation = true | ||
// TODO: add API to explicitly set startTime, after all | ||
// we will get the trace context in the invocation | ||
let startTime = XRayRecorder.Timestamp.now() | ||
return self.runtimeClient.getNextInvocation(logger: logger).peekError { error in | ||
logger.error("could not fetch work from lambda runtime engine: \(error)") | ||
}.flatMap { invocation, event in | ||
|
@@ -77,7 +80,7 @@ extension Lambda { | |
eventLoop: self.eventLoop, | ||
allocator: self.allocator, | ||
invocation: invocation) | ||
let baggage = context.baggage | ||
self.tracer.beginSegment(name: "getNextInvocation", baggage: context.baggage, startTime: startTime).end() | ||
logger.debug("sending invocation to lambda handler \(handler)") | ||
return handler.handle(context: context, event: event) | ||
// Hopping back to "our" EventLoop is importnant in case the handler returns a future that | ||
|
@@ -89,19 +92,19 @@ extension Lambda { | |
if case .failure(let error) = result { | ||
logger.warning("lambda handler returned an error: \(error)") | ||
} | ||
return (invocation, result, baggage) | ||
return (invocation, result, context) | ||
} | ||
}.flatMap { (invocation, result, baggage: BaggageContext) in | ||
}.flatMap { (invocation, result, context: Context) in | ||
// 3. report results to runtime engine | ||
self.tracer.segment(name: "ReportResults", baggage: baggage) { _ in | ||
self.runtimeClient.reportResults(logger: logger, invocation: invocation, result: result).peekError { error in | ||
self.tracer.segment(name: "ReportResults", baggage: context.baggage) { segment in | ||
self.runtimeClient.reportResults(logger: logger, invocation: invocation, result: result, | ||
context: segment.baggage).peekError { error in | ||
logger.error("could not report results to lambda runtime engine: \(error)") | ||
} | ||
} | ||
}.flatMap { | ||
// flush the tracer after each invocation | ||
self.tracer.flush(on: self.eventLoop) | ||
} | ||
// flush the tracer after each invocation | ||
.flush(self.tracer, recover: false) | ||
} | ||
|
||
/// cancels the current run, if we are waiting for next invocation (long poll from Lambda control plane) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,8 @@ | |
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
import AWSXRayRecorder | ||
import Baggage | ||
import Logging | ||
import NIO | ||
import NIOHTTP1 | ||
|
@@ -27,16 +29,17 @@ extension Lambda { | |
private let allocator = ByteBufferAllocator() | ||
private let httpClient: HTTPClient | ||
|
||
init(eventLoop: EventLoop, configuration: Configuration.RuntimeEngine) { | ||
init(eventLoop: EventLoop, configuration: Configuration.RuntimeEngine, tracer: TracingInstrument) { | ||
self.eventLoop = eventLoop | ||
self.httpClient = HTTPClient(eventLoop: eventLoop, configuration: configuration) | ||
self.httpClient = HTTPClient(eventLoop: eventLoop, configuration: configuration, tracer: tracer) | ||
} | ||
|
||
/// Requests invocation from the control plane. | ||
func getNextInvocation(logger: Logger) -> EventLoopFuture<(Invocation, ByteBuffer)> { | ||
let url = Consts.invocationURLPrefix + Consts.getNextInvocationURLSuffix | ||
logger.debug("requesting work from lambda runtime engine using \(url)") | ||
return self.httpClient.get(url: url, headers: RuntimeClient.defaultHeaders).flatMapThrowing { response in | ||
// we will get the trace context in the invocation response | ||
return self.httpClient.get(url: url, headers: RuntimeClient.defaultHeaders, context: .init()).flatMapThrowing { response in | ||
guard response.status == .ok else { | ||
throw RuntimeError.badStatusCode(response.status) | ||
} | ||
|
@@ -58,7 +61,7 @@ extension Lambda { | |
} | ||
|
||
/// Reports a result to the Runtime Engine. | ||
func reportResults(logger: Logger, invocation: Invocation, result: Result<ByteBuffer?, Error>) -> EventLoopFuture<Void> { | ||
func reportResults(logger: Logger, invocation: Invocation, result: Result<ByteBuffer?, Error>, context: BaggageContext) -> EventLoopFuture<Void> { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in some places we refer to @ktoso WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah saw more discussion to this end above. looks like this is still a bit of area of confusion so maybe good feedback for the API and naming choices. Lets see if we can come with something that is intuitive otherwise we will end up having this discussion every time There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah very much agreed... We wrote thought this through and there are documented guidelines here: https://github.com/slashmo/gsoc-swift-baggage-context#argument-namingpositioning Specifically, it (and any other context) basically always should be context; a parameter should never be called baggage; the ONLY place there can be // I'm very open to finding better names. Will try to brainstorm this some more with feedback from this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Using just the name Baggage we should think some more about if that's an option... That's going to be a standard name: https://w3c.github.io/baggage/ and it matches pretty much what we are TBH, a bag of pretty structured data. BUT there is also https://www.w3.org/TR/trace-context/ and their relationship still remains messy (in the standards to be honest). BaggageContext is a term known from PivotTracing and TracingPlane and somewhat "known" in the tracing space... Open question really, but we need to be careful here :~ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. personally, naming it just |
||
var url = Consts.invocationURLPrefix + "/" + invocation.requestID | ||
var body: ByteBuffer? | ||
let headers: HTTPHeaders | ||
|
@@ -77,7 +80,7 @@ extension Lambda { | |
headers = RuntimeClient.errorHeaders | ||
} | ||
logger.debug("reporting results to lambda runtime engine using \(url)") | ||
return self.httpClient.post(url: url, headers: headers, body: body).flatMapThrowing { response in | ||
return self.httpClient.post(url: url, headers: headers, body: body, context: context).flatMapThrowing { response in | ||
guard response.status == .accepted else { | ||
throw RuntimeError.badStatusCode(response.status) | ||
} | ||
|
@@ -102,7 +105,7 @@ extension Lambda { | |
var body = self.allocator.buffer(capacity: bytes.count) | ||
body.writeBytes(bytes) | ||
logger.warning("reporting initialization error to lambda runtime engine using \(url)") | ||
return self.httpClient.post(url: url, headers: RuntimeClient.errorHeaders, body: body).flatMapThrowing { response in | ||
return self.httpClient.post(url: url, headers: RuntimeClient.errorHeaders, body: body, context: .init()).flatMapThrowing { response in | ||
guard response.status == .accepted else { | ||
throw RuntimeError.badStatusCode(response.status) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh we can start out with just a lock and change only if proven to matter a lot;
The https://blog.nelhage.com/post/rwlock-contention/ keeps being brought up when we reach for RWLocks recently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RWlock are great when you do single (or very very very little) write and all-reads. in mixed mode it can get tricky tp get good performance