Description
Expected Behaviour
Today, for an unknown reason in one of our Lambdas in Production, the Tracer
object was unable to retrieve the current Segment
from X-Ray.
Because the Segment
was attempted to be retrieved in the middy middleware at the beginning of every request (here) it caused every Lambda request to fail until we forced a cold-start of the Lambda, which resolved our problem.
It caused the request to fail because the Tracer
throws an error if the Segment
returned is undefined
.
Unfortunately our Lambda was kept warm for over an hour before we tracked it down and forced a cold-start. In the meantime every request failed (see execution logs below).
My expected behaviour would be that if a Segment could not be retrieved, that it doesn't cause the whole Lambda invocation to fail, which it currently does when using the middy
middleware.
Tracing should be done on a best-effort basis, if there is any problem tracing a request it should not cause the main request itself to fail. This particular scenario of getting the Segment
is one specific case that could be improved to avoid failing the request, but it's quite possible there may be other code-paths that are vulnerable to causing the main request to fail, particularly if instrumented by the captureLambdaHandler middy middleware.
Current Behaviour
As mentioned above, the Tracer
currently throws an error if the Segment
returned is undefined
, this causes the middy middleware to fail every request, since getSegment()
is called in the captureLambdaHandlerBefore
function (here).
Code snippet
You could wire in a dummy Tracing provider to return undefined
from its getSegment()
method, called here.
Steps to Reproduce
We don't know yet what caused aws-xray-sdk-core
's getSegment()
(here) to return undefined
, but if you can simulate that, and use the captureLambdaHandler
middy middleware, you will find that it causes the wrapped request to fail.
For us it caused the request to fail until the function was cold-started since getSegment()
never started returning a Segment
object.
Possible Solution
My suggestion would be to consider updating the Tracer.getSegment() method to log a warning and return a dummy SubSegment
if a real segment couldn't be retrieved, similar to how it already does if tracing is disabled.
Something like as follows:
public getSegment(): Segment | Subsegment {
if (!this.isTracingEnabled()) {
return new Subsegment('## Dummy segment');
}
const segment = this.provider.getSegment();
if (segment == null) { // changed from === undefined to be slightly more robust in case null is returned somehow
// throw new Error('Failed to get the current sub/segment from the context.');
console.warn('Failed to get the current sub/segment from the context, so failing over to manually create one to allow the request to proceed.');
return new Subsegment('## Fallback segment (as unable to retrieve existing Segment)');
}
return segment;
}
AWS Lambda Powertools for TypeScript version
1.6.0
AWS Lambda function runtime
18.x
Packaging format used
Lambda Layers
Execution logs
2023-03-07T17:08:23.043Z xxxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx ERROR Error: Failed to get the current sub/segment from the context.
at Object.contextMissingLogError [as contextMissing] (/var/task/index.js:15505:23)
at getSegment (/var/task/index.js:15569:49)
at ProviderService.getSegment (/var/task/index.js:20115:51)
at Tracer2.getSegment (/var/task/index.js:20539:39)
at open (/var/task/index.js:20841:32)
at captureLambdaHandlerBefore (/var/task/index.js:20852:11)
at runMiddlewares (/var/task/index.js:24918:23)
at async runRequest (/var/task/index.js:24876:5)
2023-03-07T17:08:23.043Z xxxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx ERROR Error: Failed to get the current sub/segment from the context.
at Object.contextMissingLogError [as contextMissing] (/var/task/index.js:15505:23)
at getSegment (/var/task/index.js:15569:49)
at ProviderService.getSegment (/var/task/index.js:20115:51)
at Tracer2.getSegment (/var/task/index.js:20539:39)
at Tracer2.addErrorAsMetadata (/var/task/index.js:20209:33)
at captureLambdaHandlerError (/var/task/index.js:20867:18)
at runMiddlewares (/var/task/index.js:24918:23)
at runRequest (/var/task/index.js:24902:13)
2023-03-07T17:08:23.044Z xxxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx ERROR Invoke Error {
"errorType": "Error",
"errorMessage": "Failed to get the current sub/segment from the context.",
"originalError": {
"errorType": "Error",
"errorMessage": "Failed to get the current sub/segment from the context.",
"stack": [
"Error: Failed to get the current sub/segment from the context.",
" at Tracer2.getSegment (/var/task/index.js:20541:17)",
" at open (/var/task/index.js:20841:32)",
" at captureLambdaHandlerBefore (/var/task/index.js:20852:11)",
" at runMiddlewares (/var/task/index.js:24918:23)",
" at async runRequest (/var/task/index.js:24876:5)"
]
},
"stack": [
"Error: Failed to get the current sub/segment from the context.",
" at Tracer2.getSegment (/var/task/index.js:20541:17)",
" at Tracer2.addErrorAsMetadata (/var/task/index.js:20209:33)",
" at captureLambdaHandlerError (/var/task/index.js:20867:18)",
" at runMiddlewares (/var/task/index.js:24918:23)",
" at runRequest (/var/task/index.js:24902:13)"
]
}