Oversampling Mitigation #366

carolabadeer · 2022-11-08T00:50:25Z

Issue #, if available: N/A

Description of changes: Implemented oversampling mitigation to support creating subsegments without sampling (control sampling decisions at the subsegment level) and an SQS message helper class.

Note: Since it is possible to modify the sampled flag of a subsegment without the need to create a new API, this is the design implemented in the changes below. For instance, to create a new unsampled subsegment and add it to a Lambda facade segment:

    segment = FacadeSegment('facade_segment', 'id', 'id', True)
    subsegment = Subsegment('sampled_subsegment', 'local', segment)
    subsegment.sampled = False
    segment.add_subsegment(subsegment)

Similarly, the begin_subsegment API has been modified to include a default sampling parameter which defaults to True, and can be set to False when creating a new unsampled subsegment.

    segment = xray_recorder.begin_segment('segment')
    subsegment = xray_recorder.begin_subsegment('unsampled_subsegment', sampling=False)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

carolabadeer · 2022-11-08T00:53:42Z

This change has been tested in a Lambda environment using the following Lambda function:

from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core.models.subsegment import Subsegment
from aws_xray_sdk.core.utils.sqs_message_helper import SqsMessageHelper
import logging 

def lambda_handler(event, context):
    
    for message in event['Records']:
        if SqsMessageHelper.isSampled(message):
            print('sampled - doing batch work')
            subsegment = xray_recorder.begin_subsegment('sampled_subsegment', sampling=True)

        else:
            print('unsampled - doing batch work')
            subsegment = xray_recorder.begin_subsegment('unsampled_subsegment', sampling=False)
    
    xray_recorder.end_subsegment()

X-Ray service map when Sampled=1 in AWSTraceHeader:

X-Ray service map when Sampled=0 in AWSTraceHeader:

srprash · 2022-11-08T03:32:22Z

aws_xray_sdk/core/models/entity.py

@@ -81,6 +81,7 @@ def add_subsegment(self, subsegment):
        """
        self._check_ended()
        subsegment.parent_id = self.id
+        subsegment.sampled = self.sampled if subsegment.sampled == True else False


Why are we doing parent based sampling if the subsegment itself is sampled? Do we need this line at all?

This is specifically for the case when a subsegment is added to an unsampled subsegment. We have to ensure that once an unsampled subsegment is created, all of its child subsegments are also unsampled.

So for example, it is possible to do this:

segment = Segment('parent_segment', 'id', 'id', True) subsegment = Subsegment('subsegment1', 'local', segment) subsegment.sampled = False segment.add_subsegment(subsegment) subsegment2 = Subsegment('subsegment2', 'local', segment) subsegment.add_subsegment(subsegment2)

In this case, it would be illegal for subsegment2 to be sampled, so it has to take the decision of its direct parent subsegment - subsegment1- which is unsampled.

Does it make more sense to move this assignment to the "begin_subsegment" function in recorder.py and just base it off of the parent there? Then we can just base it off of the entity here "current_entity = self.get_trace_entity()". That entity would be the direct parent no?

Having to check two levels like that is a code smell to me.

There are two different ways to create subsegments, which is why this had to be done on two different levels - @srprash please correct me if I'm missing something

srprash · 2022-11-08T03:41:06Z

aws_xray_sdk/core/recorder.py

+        :param bool sampling: sampling decision for the subsegment being created,
+            defaults to True


This sampling parameter is not the same in meaning with the sampled flag on an entity.
sampling=True here means that the subsegment will inherit the sampling decision of its parent entity, and could either be sampled or not.
sampling=False would mean that the subsegment will not inherit the sampling decision and will be unsampled.

I hope I was able to explain the difference. Please reword the description. :)

srprash · 2022-11-08T03:41:49Z

aws_xray_sdk/core/recorder.py

@@ -335,7 +340,7 @@ def end_subsegment(self, end_time=None):

        # if segment is already close, we check if we can send entire segment
        # otherwise we check if we need to stream some subsegments
-        if self.current_segment().ready_to_send():
+        if self.current_segment().ready_to_send(): 


nit: unnecessary white space.

srprash · 2022-11-08T03:48:44Z

aws_xray_sdk/core/utils/sqs_message_helper.py

+    def isSampled(sqs_message):
+        attributes = sqs_message['attributes']
+
+        if 'AWSTraceHeader' not in attributes:


I suggest making AWSTraceHeader into a constant and reuse it in both the places in this method.

srprash · 2022-11-08T03:49:05Z

aws_xray_sdk/ext/util.py

@@ -35,7 +35,7 @@ def inject_trace_header(headers, entity):
    else:
        header = entity.get_origin_trace_header()
    data = header.data if header else None
-
+    


nit: unnecessary white spaces

atshaw43 · 2022-11-08T17:38:20Z

Can we also get a unit test for the emitter?

srprash · 2022-11-08T17:38:28Z

@carolabadeer I know we decided on this approach in our conversation. But now when I think of it again, I agree with @atshaw43 that we should keep the SDKs consistent. There could be confusion among customers why the a begin_subsegment_without_sampling API exists in other SDKs but not in this.

atshaw43 · 2022-11-08T17:31:23Z

aws_xray_sdk/core/recorder.py

-        if not segment.sampled:
+        current_entity = self.get_trace_entity()
+
+        if not current_entity.sampled or not sampling:


Would looking at just current_entity.sampled be enough here?

I don't think so. If the current entity (segment or subsegment) is sampled and a new unsampled subsegment is being added, it is necessary to check the sampling flag being set on the newly created subsegment.

carolabadeer · 2022-11-08T22:22:37Z

Updated lambda handler function with the new begin_subsegment_without_sampling API:

def lambda_handler(event, context):

    for message in event['Records']:
        if SqsMessageHelper.isSampled(message):
            print('sampled - doing batch work')
            subsegment = xray_recorder.begin_subsegment('sampled_subsegment')

        else:
            print('unsampled - doing batch work')
            subsegment = xray_recorder.begin_subsegment_without_sampling('unsampled_subsegment')
    
    xray_recorder.end_subsegment()

srprash · 2022-11-09T21:45:36Z

aws_xray_sdk/core/recorder.py

            subsegment = DummySubsegment(segment, name)
+            subsegment.sampled = False


This code is redundant. A DummySubsegment is already unsampled.
https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/models/dummy_entities.py#L96

srprash · 2022-11-09T21:49:34Z

aws_xray_sdk/core/recorder.py

        else:
            subsegment = Subsegment(name, namespace, segment)

        self.context.put_subsegment(subsegment)

        return subsegment

+    def begin_subsegment_without_sampling(self, name):


There's lot of duplicate code here with begin_subsegment. Can you move the common code to another function and reuse?

srprash · 2022-11-09T21:50:15Z

aws_xray_sdk/core/recorder.py

+            return None
+
+        subsegment = DummySubsegment(segment, name)
+        subsegment.sampled = False


Need not set sampled = False on a DummySubsegment.

srprash · 2022-11-10T15:54:12Z

aws_xray_sdk/core/recorder.py

+        '''
+        Helper method to begin_subsegment and begin_subsegment_without_sampling
+        '''
+        result = True


I feel this is still problematic. I'm not sure I get the point of using result and returning it.
Also, result seem to be of variable types. Here it is a boolean but later it can be a Subsegment. This can easily confusion and bugs.

I believe there is a better way to leverage this helper method than it's callers having to check its return types and values.
_begin_subsegment_helper can accept a parameter as flag for sampling or not. The callers can simply call this method with the flag and get back a Subsegment depending on the flag's value. How does that sound?
The method definition could look like:

def _begin_subsegment_helper(name, namespace='local', beginWithoutSampling=False)

srprash

Thanks!!

implemented oversampling mitigation

d5cd83b

carolabadeer requested a review from a team as a code owner November 8, 2022 00:50

updated begin_subsegment docstring to include sampling parameter

3042958

srprash reviewed Nov 8, 2022

View reviewed changes

atshaw43 reviewed Nov 8, 2022

View reviewed changes

atshaw43 closed this Nov 8, 2022

carolabadeer reopened this Nov 8, 2022

created separate API for adding subsegments without sampling

4aca447

carolabadeer added 2 commits November 9, 2022 10:12

Modified add_subsegment method to log warning for orphaned subsegments

4d4b9b5

updated unit tests

8e4ed46

srprash reviewed Nov 9, 2022

View reviewed changes

addressing feedback

a99fc6d

atshaw43 approved these changes Nov 10, 2022

View reviewed changes

srprash reviewed Nov 10, 2022

View reviewed changes

carolabadeer added 3 commits November 10, 2022 09:21

final design changes

05503a8

remove default namespace value

645fb06

minor fix

91fa8c5

srprash approved these changes Nov 10, 2022

View reviewed changes

carolabadeer merged commit 6b64982 into aws:master Nov 10, 2022

		:param bool sampling: sampling decision for the subsegment being created,
		defaults to True

		subsegment = DummySubsegment(segment, name)
		subsegment.sampled = False

Oversampling Mitigation #366

Oversampling Mitigation #366

Uh oh!

Conversation

carolabadeer commented Nov 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

carolabadeer commented Nov 8, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

carolabadeer Nov 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

atshaw43 commented Nov 8, 2022

Uh oh!

srprash commented Nov 8, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

carolabadeer commented Nov 8, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

srprash left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

carolabadeer commented Nov 8, 2022 •

edited

Loading

carolabadeer Nov 8, 2022 •

edited

Loading