-
Notifications
You must be signed in to change notification settings - Fork 113
RFC: instrument lambda handler #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 14 commits
3bfc54f
4fa2e4c
99e2d36
b639901
6609e51
7b42510
fed24c2
50336b6
5ee3cfe
5f9773b
ccb11da
bda0014
f3b4beb
8cc1072
03b745f
dc5c7b9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,9 +12,12 @@ | |
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
import AWSXRayRecorder | ||
import Baggage | ||
import Dispatch | ||
import Logging | ||
import NIO | ||
import NIOConcurrencyHelpers | ||
|
||
// MARK: - InitializationContext | ||
|
||
|
@@ -50,6 +53,9 @@ extension Lambda { | |
/// Lambda runtime context. | ||
/// The Lambda runtime generates and passes the `Context` to the Lambda handler as an argument. | ||
public final class Context: CustomDebugStringConvertible { | ||
// TODO: use RWLock (separate PR) | ||
private let lock = Lock() | ||
|
||
/// The request ID, which identifies the request that triggered the function invocation. | ||
public let requestID: String | ||
|
||
|
@@ -68,11 +74,23 @@ extension Lambda { | |
/// For invocations from the AWS Mobile SDK, data about the client application and device. | ||
public let clientContext: String? | ||
|
||
// TODO: or should the Lambda "runtime" context and the Baggage context be separate? | ||
private var _baggage: BaggageContext | ||
|
||
/// Baggage context. | ||
public var baggage: BaggageContext { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. does this need to be a public There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's currently forced to this via the Carrier's requirement, but it may well be that requirement is quite wrong and should just be |
||
get { self.lock.withLock { _baggage } } | ||
set { self.lock.withLockVoid { _baggage = newValue } } | ||
} | ||
|
||
/// `Logger` to log with | ||
/// | ||
/// - note: The `LogLevel` can be configured using the `LOG_LEVEL` environment variable. | ||
public let logger: Logger | ||
|
||
/// Tracing instrument. | ||
public let tracer: TracingInstrument | ||
pokryfka marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
/// The `EventLoop` the Lambda is executed on. Use this to schedule work with. | ||
/// This is useful when implementing the `EventLoopLambdaHandler` protocol. | ||
/// | ||
|
@@ -91,8 +109,10 @@ extension Lambda { | |
cognitoIdentity: String? = nil, | ||
clientContext: String? = nil, | ||
logger: Logger, | ||
tracer: TracingInstrument, | ||
eventLoop: EventLoop, | ||
allocator: ByteBufferAllocator) { | ||
allocator: ByteBufferAllocator) | ||
ktoso marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{ | ||
self.requestID = requestID | ||
self.traceID = traceID | ||
self.invokedFunctionARN = invokedFunctionARN | ||
|
@@ -106,7 +126,12 @@ extension Lambda { | |
var logger = logger | ||
logger[metadataKey: "awsRequestID"] = .string(requestID) | ||
logger[metadataKey: "awsTraceID"] = .string(traceID) | ||
var baggage = BaggageContext() | ||
// TODO: use `swift-tracing` API, note that, regardless, we can ONLY extract X-Ray Context | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not really; "we" don't decide what we extract, the configuration of instruments decides that, and yes since xray would be configured it'd extract it's own context here. Specifically:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
the problem here is that only the X-Ray trace context is provided by Lambda Runtime API (in header), see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html#runtimes-api-next Context for other instruments may be provided in invocation payload and needs to be extracted by user in lambda event handler they implement. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, I missed that bit of the Lambda design. So in this integration style, even if one triggered the lambda via an http request, it would not get “the http request” but just the body, and the headers are the ones as listed on there, including the XRay trace header etc. There AFAIR exists an integration mode though to get the entire request, right?
Is this something that the runtime currently is able to handle? Seems more to be about how the API Gateway is configured right? Though I’ve not had the time to dig deeper into this yet. You’d probably know more about this @fabianfett, we can catch up about this today maybe? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess you are referring to API Gateway Lambda proxy integration This does NOT affect Lambda custom runtime API, it affects API Gateway v1 "REST API" routing which, if configured that way, does not try to route events based on a RESTful model, instead it forwards all events to lambda which needs to resolve HTTP method, path and arguments itself based on the content in the event payload (but it still remains in the event payload -> invocation payload):
Note that:
For reference Integrating AWS X-Ray with other AWS services There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for explaining this in more depth @pokryfka, I need to read up some more here about aws/lambda in general it seems. |
||
baggage.xRayContext = try? XRayContext(tracingHeader: traceID) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would we want to let the users decide what tracer to use, or since this is AWS oriented anyways just pin to x-ray? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe that's the main TODO in this PR - not using the xray explicitly. Even if it is bound to xRay in reality, we should make sure to only use the abstract API, maybe maybe some day there would be some other tracer or maybe amazon decide to make their own or something, no idea, but let's keep the door open for future evolution. Using the tracing API also means that while developing locally you could plug in the Instruments(.app) (naming gets confusing...) tracer: slashmo/gsoc-swift-tracing#97 and see spans in Instruments on the mac. Instruments does not really understand / visualize "traces" with parents etc well today... but it's something we can keep in mind, maybe it'll get better at displaying those and then when developing locally you get the same user experience with tracing as on prod :-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ktoso I generally agree with you. I'm however a little concerned, that tracing will require manual adjustment (incl. adding another dependency) if we don't include the XRay tracing by default. Lambda tracing wouldn't work out of the box in this case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I get that -- though the lambda runtime "core" should not be instrumented using any specific tracer, regardless of "yes it'll be xray" but maybe some day down the road there's other impls, and you'd want to swap it. We could absolutely though make some "batteries included" package, we should think how to pull that off, wdyt? |
||
self._baggage = baggage | ||
self.logger = logger | ||
self.tracer = tracer | ||
} | ||
|
||
public func getRemainingTime() -> TimeAmount { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -129,18 +129,31 @@ public protocol EventLoopLambdaHandler: ByteBufferLambdaHandler { | |
public extension EventLoopLambdaHandler { | ||
/// Driver for `ByteBuffer` -> `In` decoding and `Out` -> `ByteBuffer` encoding | ||
func handle(context: Lambda.Context, event: ByteBuffer) -> EventLoopFuture<ByteBuffer?> { | ||
switch self.decodeIn(buffer: event) { | ||
let segment = context.tracer.beginSegment(name: "HandleEvent", baggage: context.baggage) | ||
let decodedEvent = segment.subsegment(name: "DecodeIn") { _ in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. btw, I've been wondering if we should offer This is quite common I think so I think we could add it... We could also do the same with a NIO extensions package then to handle Future returning blocks 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ktoso +1 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do use "helpers" both with closures and NIO futures as referenced here slashmo/gsoc-swift-tracing#125 (comment) |
||
self.decodeIn(buffer: event) | ||
} | ||
switch decodedEvent { | ||
case .failure(let error): | ||
segment.addError(error) | ||
segment.end() | ||
return context.eventLoop.makeFailedFuture(CodecError.requestDecoding(error)) | ||
case .success(let `in`): | ||
return self.handle(context: context, event: `in`).flatMapThrowing { out in | ||
switch self.encodeOut(allocator: context.allocator, value: out) { | ||
case .failure(let error): | ||
throw CodecError.responseEncoding(error) | ||
case .success(let buffer): | ||
return buffer | ||
let subsegment = segment.beginSubsegment(name: "HandleIn") | ||
context.baggage = subsegment.baggage | ||
return self.handle(context: context, event: `in`) | ||
.endSegment(subsegment) | ||
.flatMapThrowing { out in | ||
try context.tracer.segment(name: "EncodeOut", baggage: segment.baggage) { _ in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Naming nitpick: I really would like to get all libs to be consistent with the use of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this will be "context" when using var span = tracer.startSpan(named: "EncodeOut", context: segment.baggage) I dont have strong opinion on that, but to avoid confusion in XRaySDK I use "context" for var baggage = BaggageContext() // empty
let context = XRayContext()
baggage.xRayContext = context
let segment = tracer.beginSegment(name: "EncodeOut", baggage: baggage) // may report missing context
// or
let segment2 = tracer.beginSegment(name: "EncodeOut", context: context) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What would you say about the following spelling though:
also because one can use the BaggageContextCarrier to assign through into the underlying baggage context;
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see the point with
though; but that's the segment API, you are free to do what you want there but still I would not recommend using baggage as parameter names, you'd want to accept a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I made a ticket to discuss naming once more: slashmo/gsoc-swift-baggage-context#23 |
||
switch self.encodeOut(allocator: context.allocator, value: out) { | ||
case .failure(let error): | ||
throw CodecError.responseEncoding(error) | ||
case .success(let buffer): | ||
return buffer | ||
} | ||
} | ||
} | ||
} | ||
.endSegment(segment) | ||
} | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,22 +12,30 @@ | |
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
import AWSXRaySDK // TODO: use swift-tracing when available | ||
import Baggage | ||
import Dispatch | ||
import Logging | ||
import NIO | ||
|
||
// type names defined in `TracingInstrument`, aliases will be removed | ||
public typealias TracingInstrument = XRayRecorder | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah I see, that confused me for a moment why it's an instrument but used segment() ;) Okey for the PoC, agreed on migrating once "released" though it would be good to use this PR to already migrate and see that the API supports everything one needs here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should make the API transition easier, temporary ... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it be possible to PoC this out using the real swift-tracing API right away though? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should be pretty much straight forward as far API is concerned. I think the missing part is to freeze and release i am trying to keep |
||
public typealias NoOpTracingInstrument = XRayNoOpRecorder | ||
|
||
extension Lambda { | ||
/// LambdaRunner manages the Lambda runtime workflow, or business logic. | ||
internal final class Runner { | ||
private let runtimeClient: RuntimeClient | ||
private let tracer: TracingInstrument | ||
private let eventLoop: EventLoop | ||
private let allocator: ByteBufferAllocator | ||
|
||
private var isGettingNextInvocation = false | ||
|
||
init(eventLoop: EventLoop, configuration: Configuration) { | ||
init(eventLoop: EventLoop, configuration: Configuration, tracer: TracingInstrument) { | ||
self.eventLoop = eventLoop | ||
self.runtimeClient = RuntimeClient(eventLoop: self.eventLoop, configuration: configuration.runtimeEngine) | ||
self.runtimeClient = RuntimeClient(eventLoop: self.eventLoop, configuration: configuration.runtimeEngine, tracer: tracer) | ||
self.tracer = tracer | ||
self.allocator = ByteBufferAllocator() | ||
} | ||
|
||
|
@@ -60,15 +68,19 @@ extension Lambda { | |
logger.debug("lambda invocation sequence starting") | ||
// 1. request invocation from lambda runtime engine | ||
self.isGettingNextInvocation = true | ||
// we will get the trace context in the invocation | ||
let startTime = XRayRecorder.Timestamp.now() | ||
return self.runtimeClient.getNextInvocation(logger: logger).peekError { error in | ||
logger.error("could not fetch work from lambda runtime engine: \(error)") | ||
}.flatMap { invocation, event in | ||
// 2. send invocation to handler | ||
self.isGettingNextInvocation = false | ||
let context = Context(logger: logger, | ||
tracer: self.tracer, | ||
eventLoop: self.eventLoop, | ||
allocator: self.allocator, | ||
invocation: invocation) | ||
self.tracer.beginSegment(name: "getNextInvocation", baggage: context.baggage, startTime: startTime).end() | ||
logger.debug("sending invocation to lambda handler \(handler)") | ||
return handler.handle(context: context, event: event) | ||
// Hopping back to "our" EventLoop is importnant in case the handler returns a future that | ||
|
@@ -80,14 +92,19 @@ extension Lambda { | |
if case .failure(let error) = result { | ||
logger.warning("lambda handler returned an error: \(error)") | ||
} | ||
return (invocation, result) | ||
return (invocation, result, context) | ||
} | ||
}.flatMap { invocation, result in | ||
}.flatMap { (invocation, result, context: Context) in | ||
// 3. report results to runtime engine | ||
self.runtimeClient.reportResults(logger: logger, invocation: invocation, result: result).peekError { error in | ||
logger.error("could not report results to lambda runtime engine: \(error)") | ||
self.tracer.segment(name: "ReportResults", baggage: context.baggage) { segment in | ||
self.runtimeClient.reportResults(logger: logger, invocation: invocation, result: result, | ||
context: segment.baggage).peekError { error in | ||
logger.error("could not report results to lambda runtime engine: \(error)") | ||
} | ||
} | ||
} | ||
// flush the tracer after each invocation | ||
.flush(self.tracer, recover: false) | ||
} | ||
|
||
/// cancels the current run, if we are waiting for next invocation (long poll from Lambda control plane) | ||
|
@@ -101,14 +118,15 @@ extension Lambda { | |
} | ||
|
||
private extension Lambda.Context { | ||
convenience init(logger: Logger, eventLoop: EventLoop, allocator: ByteBufferAllocator, invocation: Lambda.Invocation) { | ||
convenience init(logger: Logger, tracer: TracingInstrument, eventLoop: EventLoop, allocator: ByteBufferAllocator, invocation: Lambda.Invocation) { | ||
self.init(requestID: invocation.requestID, | ||
traceID: invocation.traceID, | ||
invokedFunctionARN: invocation.invokedFunctionARN, | ||
deadline: DispatchWallTime(millisSinceEpoch: invocation.deadlineInMillisSinceEpoch), | ||
cognitoIdentity: invocation.cognitoIdentity, | ||
clientContext: invocation.clientContext, | ||
logger: logger, | ||
tracer: tracer, | ||
eventLoop: eventLoop, | ||
allocator: allocator) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh we can start out with just a lock and change only if proven to matter a lot;
The https://blog.nelhage.com/post/rwlock-contention/ keeps being brought up when we reach for RWLocks recently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RWlock are great when you do single (or very very very little) write and all-reads. in mixed mode it can get tricky tp get good performance