RFC: instrument lambda handler #162
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,9 +12,12 @@ | |
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
import AWSXRayRecorder | ||
import Baggage | ||
import Dispatch | ||
import Logging | ||
import NIO | ||
import NIOConcurrencyHelpers | ||
|
||
// MARK: - InitializationContext | ||
|
||
|
@@ -50,6 +53,9 @@ extension Lambda { | |
/// Lambda runtime context. | ||
/// The Lambda runtime generates and passes the `Context` to the Lambda handler as an argument. | ||
public final class Context: CustomDebugStringConvertible { | ||
// TODO: use RWLock (separate PR) | ||
private let lock = Lock() | ||
Review comment: tbh we can start out with just a lock and change it only if that is proven to matter a lot; https://blog.nelhage.com/post/rwlock-contention/ keeps being brought up whenever we reach for RWLocks recently.
Reply: RWLocks are great when you do a single write (or very, very few writes) and otherwise only reads. In mixed mode it can get tricky to get good performance. |
||
|
||
/// The request ID, which identifies the request that triggered the function invocation. | ||
public let requestID: String | ||
|
||
|
@@ -68,11 +74,23 @@ extension Lambda { | |
/// For invocations from the AWS Mobile SDK, data about the client application and device. | ||
public let clientContext: String? | ||
|
||
// TODO: or should the Lambda "runtime" context and the Baggage context be separate? | ||
private var _baggage: BaggageContext | ||
|
||
/// Baggage context. | ||
public var baggage: BaggageContext { | ||
Review comment: does this need to be a public […]
Reply: It's currently forced to this via the Carrier's requirement, but it may well be that the requirement is quite wrong and should just be […] |
||
get { self.lock.withLock { _baggage } } | ||
set { self.lock.withLockVoid { _baggage = newValue } } | ||
} | ||
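The lock-guarded `baggage` accessor in the hunk above can be illustrated in isolation. This is a minimal sketch, assuming a hypothetical `GuardedContext` class, a plain dictionary in place of `BaggageContext`, and Foundation's `NSLock` in place of NIO's `Lock`; none of these names come from the PR.

```swift
import Foundation

/// Hypothetical stand-in for `Lambda.Context`; only the locking pattern matters here.
final class GuardedContext {
    private let lock = NSLock()
    // A dictionary stands in for `BaggageContext` (assumption for illustration).
    private var _baggage: [String: String]

    init(baggage: [String: String] = [:]) {
        self._baggage = baggage
    }

    /// Runs `body` while holding the lock and always releases it afterwards.
    private func withLock<T>(_ body: () -> T) -> T {
        lock.lock()
        defer { lock.unlock() }
        return body()
    }

    /// Every read and every write takes the same plain lock.
    var baggage: [String: String] {
        get { withLock { _baggage } }
        set { withLock { _baggage = newValue } }
    }
}

let context = GuardedContext()
context.baggage = ["traceID": "abc"]
let snapshot = context.baggage
```

Note that each access is protected on its own: an in-place mutation such as `context.baggage["k"] = "v"` performs a locked read followed by a separate locked write, so it is not atomic as a whole.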
|
||
/// `Logger` to log with | ||
/// | ||
/// - note: The `LogLevel` can be configured using the `LOG_LEVEL` environment variable. | ||
public let logger: Logger | ||
|
||
/// Tracing instrument. | ||
public let tracer: TracingInstrument | ||
|
||
|
||
/// The `EventLoop` the Lambda is executed on. Use this to schedule work with. | ||
/// This is useful when implementing the `EventLoopLambdaHandler` protocol. | ||
/// | ||
|
@@ -91,8 +109,10 @@ extension Lambda { | |
cognitoIdentity: String? = nil, | ||
clientContext: String? = nil, | ||
logger: Logger, | ||
tracer: TracingInstrument, | ||
eventLoop: EventLoop, | ||
allocator: ByteBufferAllocator) { | ||
allocator: ByteBufferAllocator) | ||
|
||
{ | ||
self.requestID = requestID | ||
self.traceID = traceID | ||
self.invokedFunctionARN = invokedFunctionARN | ||
|
@@ -106,7 +126,12 @@ extension Lambda { | |
var logger = logger | ||
logger[metadataKey: "awsRequestID"] = .string(requestID) | ||
logger[metadataKey: "awsTraceID"] = .string(traceID) | ||
var baggage = BaggageContext() | ||
// TODO: use `swift-tracing` API, note that, regardless, we can ONLY extract X-Ray Context | ||
Review comment: Not really; "we" don't decide what we extract, the configuration of instruments decides that. And yes, since X-Ray would be configured, it'd extract its own context here. Specifically: […]
Reply: The problem here is that only the X-Ray trace context is provided by the Lambda Runtime API (in a header), see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html#runtimes-api-next. Context for other instruments may be provided in the invocation payload and needs to be extracted by users in the Lambda event handler they implement.
Reply: I see, I missed that bit of the Lambda design. So in this integration style, even if one triggered the lambda via an HTTP request, it would not get "the HTTP request" but just the body, and the headers are the ones listed there, including the X-Ray trace header etc. AFAIR there exists an integration mode to get the entire request though, right? Is this something that the runtime is currently able to handle? It seems to be more about how the API Gateway is configured, right? Though I've not had the time to dig deeper into this yet. You'd probably know more about this @fabianfett, maybe we can catch up about this today?
Reply: I guess you are referring to API Gateway Lambda proxy integration. This does NOT affect the Lambda custom runtime API. It affects API Gateway v1 "REST API" routing which, if configured that way, does not try to route events based on a RESTful model; instead it forwards all events to the lambda, which needs to resolve the HTTP method, path and arguments itself based on the content of the event payload (but it still remains in the event payload -> invocation payload): […]
Note that: […]
For reference: Integrating AWS X-Ray with other AWS services.
Reply: Thanks for explaining this in more depth @pokryfka, it seems I need to read up some more on AWS Lambda in general. |
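To make the header discussion concrete: the Lambda Runtime API passes the X-Ray trace context in the `Lambda-Runtime-Trace-Id` response header, as a `;`-separated list of `Key=Value` pairs. The following is a rough sketch of parsing that value, assuming a hypothetical `TraceHeader` type; the real `XRayContext(tracingHeader:)` in `AWSXRayRecorder` may validate and model the fields differently.

```swift
import Foundation

/// Hypothetical value type for illustration; not the `AWSXRayRecorder` API.
struct TraceHeader: Equatable {
    var root: String
    var parent: String?
    var sampled: Bool?

    /// Parses `Key=Value` pairs separated by `;`. Returns `nil` if `Root` is missing.
    init?(_ header: String) {
        var fields: [String: String] = [:]
        for pair in header.split(separator: ";") {
            let parts = pair.split(separator: "=", maxSplits: 1)
            guard parts.count == 2 else { continue }
            let key = parts[0].trimmingCharacters(in: .whitespaces)
            fields[key] = String(parts[1])
        }
        guard let root = fields["Root"] else { return nil }
        self.root = root
        self.parent = fields["Parent"]
        // Anything other than "1" or "0" is treated as "no decision" (assumption).
        switch fields["Sampled"] {
        case .some("1"): self.sampled = true
        case .some("0"): self.sampled = false
        default: self.sampled = nil
        }
    }
}

let parsed = TraceHeader("Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1")
```

Context for any other instrument would not appear in this header; as noted in the thread, it would have to be extracted from the invocation payload by the handler itself.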
||
baggage.xRayContext = try? XRayContext(tracingHeader: traceID) | ||
Review comment: Would we want to let users decide what tracer to use, or, since this is AWS oriented anyway, just pin to X-Ray?
Reply: I believe that's the main TODO in this PR: not using X-Ray explicitly. Even if it is bound to X-Ray in reality, we should make sure to only use the abstract API. Maybe some day there will be some other tracer, or maybe Amazon decides to make their own or something, no idea, but let's keep the door open for future evolution. Using the tracing API also means that while developing locally you could plug in the Instruments(.app) (naming gets confusing...) tracer, slashmo/gsoc-swift-tracing#97, and see spans in Instruments on the Mac. Instruments does not really understand or visualize "traces" with parents etc. well today, but it's something we can keep in mind; maybe it'll get better at displaying those, and then when developing locally you get the same user experience with tracing as on prod :-)
Reply: @ktoso I generally agree with you. However, I'm a little concerned that tracing will require manual adjustment (incl. adding another dependency) if we don't include X-Ray tracing by default. Lambda tracing wouldn't work out of the box in this case.
Reply: Yeah, I get that, though the lambda runtime "core" should not be instrumented using any specific tracer, regardless of "yes it'll be xray"; maybe some day down the road there are other impls and you'd want to swap it. We could absolutely make some "batteries included" package though; we should think about how to pull that off, wdyt? |
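The layering argued for in this thread (a core that only sees an abstract tracer, with a concrete one plugged in from outside) might look roughly like the sketch below. All names here (`SimpleTracer`, `NoOpTracer`, `InMemoryTracer`, `Runtime`) are hypothetical and deliberately much smaller than the `swift-tracing` API.

```swift
/// Hypothetical minimal tracer interface; not the `swift-tracing` API.
protocol SimpleTracer {
    func record(spanNamed name: String)
}

/// Default used when no tracer is configured: tracing becomes a no-op.
struct NoOpTracer: SimpleTracer {
    func record(spanNamed name: String) {}
}

/// A concrete tracer that a separate "batteries included" package could supply.
final class InMemoryTracer: SimpleTracer {
    private(set) var names: [String] = []
    func record(spanNamed name: String) { names.append(name) }
}

/// The runtime "core" depends only on the protocol, never on a concrete tracer.
struct Runtime {
    let tracer: SimpleTracer

    init(tracer: SimpleTracer = NoOpTracer()) {
        self.tracer = tracer
    }

    func invoke(_ handler: () -> String) -> String {
        tracer.record(spanNamed: "HandleEvent")
        return handler()
    }
}

let recorder = InMemoryTracer()
let result = Runtime(tracer: recorder).invoke { "ok" }
```

With this split, the concern about tracing not working out of the box would be addressed by whichever package constructs `Runtime` with a real tracer, rather than by the core importing one.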
||
self._baggage = baggage | ||
self.logger = logger | ||
self.tracer = tracer | ||
} | ||
|
||
public func getRemainingTime() -> TimeAmount { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -129,18 +129,31 @@ public protocol EventLoopLambdaHandler: ByteBufferLambdaHandler { | |
public extension EventLoopLambdaHandler { | ||
/// Driver for `ByteBuffer` -> `In` decoding and `Out` -> `ByteBuffer` encoding | ||
func handle(context: Lambda.Context, event: ByteBuffer) -> EventLoopFuture<ByteBuffer?> { | ||
switch self.decodeIn(buffer: event) { | ||
let segment = context.tracer.beginSegment(name: "HandleEvent", baggage: context.baggage) | ||
let decodedEvent = segment.subsegment(name: "DecodeIn") { _ in | ||
Review comment: btw, I've been wondering if we should offer […] This is quite common, so I think we could add it... We could also do the same with a NIO extensions package to handle Future-returning blocks 🤔
Reply: @ktoso +1
Reply: I do use "helpers" both with closures and NIO futures, as referenced here: slashmo/gsoc-swift-tracing#125 (comment) |
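A closure-based helper of the kind discussed here could be sketched as follows. `DemoSpan` and `withSpan` are hypothetical names used only for illustration; they are not part of `swift-tracing` or `AWSXRayRecorder`, and the sketch covers only synchronous blocks, not `EventLoopFuture`-returning ones.

```swift
/// Hypothetical span type for illustration only.
final class DemoSpan {
    let name: String
    private(set) var errors: [Error] = []
    private(set) var isEnded = false

    init(name: String) { self.name = name }

    func addError(_ error: Error) { errors.append(error) }
    func end() { isEnded = true }
}

/// Runs `body` inside `span`, records a thrown error, and always ends the span.
func withSpan<T>(_ span: DemoSpan, _ body: (DemoSpan) throws -> T) rethrows -> T {
    defer { span.end() }
    do {
        return try body(span)
    } catch {
        span.addError(error)
        throw error
    }
}

struct DecodeError: Error {}

func decode(_ text: String) throws -> Int {
    guard let number = Int(text) else { throw DecodeError() }
    return number
}

let okSpan = DemoSpan(name: "DecodeIn")
let value = try? withSpan(okSpan) { _ in try decode("42") }

let badSpan = DemoSpan(name: "DecodeIn")
let failed = try? withSpan(badSpan) { _ in try decode("not a number") }
```

The point of such a helper is that the caller cannot forget `end()` or `addError(_:)` on the failure path, which the handler code in this diff otherwise has to do by hand.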
||
self.decodeIn(buffer: event) | ||
} | ||
switch decodedEvent { | ||
case .failure(let error): | ||
segment.addError(error) | ||
segment.end() | ||
return context.eventLoop.makeFailedFuture(CodecError.requestDecoding(error)) | ||
case .success(let `in`): | ||
return self.handle(context: context, event: `in`).flatMapThrowing { out in | ||
switch self.encodeOut(allocator: context.allocator, value: out) { | ||
case .failure(let error): | ||
throw CodecError.responseEncoding(error) | ||
case .success(let buffer): | ||
return buffer | ||
let subsegment = segment.beginSubsegment(name: "HandleIn") | ||
context.baggage = subsegment.baggage | ||
return self.handle(context: context, event: `in`) | ||
.endSegment(subsegment) | ||
.flatMapThrowing { out in | ||
try context.tracer.segment(name: "EncodeOut", baggage: segment.baggage) { _ in | ||
Review comment: Naming nitpick: I really would like to get all libs to be consistent with the use of the […]
Reply: This will be "context" when using […]
var span = tracer.startSpan(named: "EncodeOut", context: segment.baggage)
I don't have a strong opinion on that, but to avoid confusion in XRaySDK I use "context" for […]
var baggage = BaggageContext() // empty
let context = XRayContext()
baggage.xRayContext = context
let segment = tracer.beginSegment(name: "EncodeOut", baggage: baggage) // may report missing context
// or
let segment2 = tracer.beginSegment(name: "EncodeOut", context: context)
Reply: What would you say about the following spelling though: […]
Also because one can use the BaggageContextCarrier to assign through into the underlying baggage context; […]
Reply: I see the point with […] though; but that's the segment API, and you are free to do what you want there. Still, I would not recommend using baggage as a parameter name; you'd want to accept a […]
Reply: I made a ticket to discuss naming once more: slashmo/gsoc-swift-baggage-context#23 |
||
switch self.encodeOut(allocator: context.allocator, value: out) { | ||
case .failure(let error): | ||
throw CodecError.responseEncoding(error) | ||
case .success(let buffer): | ||
return buffer | ||
} | ||
} | ||
} | ||
} | ||
.endSegment(segment) | ||
} | ||
} | ||
|
||
|
Review comment: maybe move this to L129
Reply: @tomerd the problem is that the current implementation in tracer.shutdown: […] EventLoop), […] EventLoop). This does not work well if lifecycle and tracer share the EventLoop: […]
Not sure what the best solution is. I am thinking of removing the flushing operation from shutdown; that way the client could always flush on the loop, and shutdown would just close all resources without trying to perform any operation on the EventLoop.
@ktoso TracingInstrument, defined in swift-tracing, only defines an interface for sync flush.
For reference, types in AWSXRayRecorder: […]
Reply: This is somewhat of a layering question. swift-tracing cannot know about NIO, so nothing about ELs or similar.
We could expose a shutdown(callback) if needed; the question being, should it also flush, and does flush need a callback as well then? Flushing is best-effort today and simply a signal to the tracer to try to flush.
The tracing instruments do not define shutdown because, similar to metrics, it is kind of assumed you're managing its lifecycle. We thought that normally you'd likely hook into swift-lifecycle with your tracer, and that'd be a specific type there, so you can do whatever you want.
Open to ideas here: what should we do in the API layer to help? Would it help if we offered hooks that can interop nicely with swift-lifecycle style callbacks?
This is again one of those examples which highlight the need for swift-server/swift-service-lifecycle#11, because we could express it as: […]
So that'd be nice.
OR, do we need a shutdown(callback) on all instruments? We've so far avoided that because swift-metrics does not deal with this either; it just says whoever started things needs to close them, so we've taken the same road.
Reply: @ktoso this is related but could be handled outside of the TracingInstrument API IMO. My problem is more with flushing: […] EventLoop […] EventLoop if possible; this is already implemented and works.
XRayRecorder provides a method to flush on a provided loop, which allows an "async flush" but then hop in within the invocation.
TracingInstrument only defines a sync forceFlush, which I provided as ~flush(eventLoop).wait(); this will work, but not if the EventLoop is shared.
TL;DR I want to change the API to TracingInstrument as soon as it's released. I cannot use its forceFlush if XRaySDK shares the event loop with the lambda runtime (which it should...).
Reply: I'd argue forceFlush() is a signal, not a blocking function. Don't wait on it, do your best, and your shutdown can "wait" until flushes are complete (because it can have either a callback version or just be fully blocking, whichever works). Wouldn't this solve the issue?
Reply: I could implement forceFlush() that way. This would mean, however, that flushing could finish after the lambda returns results (because TracingInstrument would not provide an API to guarantee that the instrument was flushed). I do not know how gentle AWS is when scaling down lambda instances and whether we can assume that shutdown would be called at all. (@fabianfett do you know about it?)
This would probably work most of the time for lambda + X-Ray, as flushing X-Ray is comparatively cheap: UDP, local network, no DNS. Flushing another tracer is going to be much more expensive (and it still should work, even if not practical, right?).
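The "forceFlush() is a signal, shutdown waits" idea from the last few comments can be sketched with a serial `DispatchQueue` standing in for the tracer's `EventLoop`. This is an illustration under stated assumptions only: `BufferingTracer` and its methods are hypothetical and do not reflect the `TracingInstrument` or `XRayRecorder` API.

```swift
import Dispatch

/// Hypothetical tracer; the serial queue plays the role of the tracer's event loop.
final class BufferingTracer: @unchecked Sendable {
    private let queue = DispatchQueue(label: "tracer.flush")
    private var pending: [String] = []   // recorded but not yet emitted
    private var emitted: [String] = []   // what a real tracer would have sent out

    func record(_ segment: String) {
        queue.async { self.pending.append(segment) }
    }

    /// A signal only: schedules a best-effort flush and returns immediately.
    func forceFlush() {
        queue.async { self.emitPending() }
    }

    /// Blocks until all previously scheduled work has run, then flushes the rest.
    func shutdown() -> [String] {
        queue.sync { () -> [String] in
            self.emitPending()
            return self.emitted
        }
    }

    private func emitPending() {
        emitted.append(contentsOf: pending)
        pending.removeAll()
    }
}

let tracer = BufferingTracer()
tracer.record("HandleEvent")
tracer.forceFlush()            // does not wait for the flush to happen
tracer.record("EncodeOut")
let flushed = tracer.shutdown()
```

The blocking `shutdown()` has the same hazard raised above for `flush(eventLoop).wait()`: calling it from work that is already running on the tracer's own queue would deadlock, so it is only safe from a different thread. It also does nothing for the open question of whether `shutdown` is called at all before an instance is scaled down.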