Add ServiceCallDuration to ApiCallAttempt metrics even for Http timeouts #2486

Open
@swaranga

Description

Today, if a call encounters an HTTP timeout, the metrics do not report how much time was spent in that particular attempt. Since SDK metrics exist largely to help debug exactly this kind of issue, it would be very helpful if the ApiCallAttempt metric collection included the ServiceCallDuration metric even when the attempt times out.

I have reproduced this issue the following way:

A simple program to make an SDK call with a very small timeout guaranteed to throw before the call can finish:

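import java.time.Duration;
import software.amazon.awssdk.services.kinesis.KinesisClient;
import software.amazon.awssdk.services.kinesis.model.ListStreamsRequest;

KinesisClient kc = ...;
kc.listStreams(
  ListStreamsRequest.builder()
    // 5 ms per attempt is almost guaranteed to time out
    .overrideConfiguration(c -> c.apiCallAttemptTimeout(Duration.ofMillis(5)))
    .build()
);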

Running this code logs the following metrics:

MetricCollection(
  name=ApiCall, 
  metrics=[
    MetricRecord(metric=MarshallingDuration, value=PT0.04227928S), 
    MetricRecord(metric=RetryCount, value=3), 
    MetricRecord(metric=ApiCallSuccessful, value=false), 
    MetricRecord(metric=OperationName, value=ListStreams), 
    MetricRecord(metric=ApiCallDuration, value=PT0.754417876S), 
    MetricRecord(metric=CredentialsFetchDuration, value=PT1.542674972S), 
    MetricRecord(metric=ServiceId, value=Kinesis)
  ], 
  children=[
    MetricCollection(
      name=ApiCallAttempt, 
      metrics=[
        MetricRecord(metric=BackoffDelayDuration, value=PT0S), 
        MetricRecord(metric=SigningDuration, value=PT0.019085065S)
      ], 
      children=[
        MetricCollection(
          name=HttpClient, 
          metrics=[
            MetricRecord(metric=HttpClientName, value=Apache)
          ], 
          children=[]
        )
      ]
    ), 
    MetricCollection(
      name=ApiCallAttempt, 
      metrics=[
        MetricRecord(metric=BackoffDelayDuration, value=PT0.091S), 
        MetricRecord(metric=SigningDuration, value=PT0.001483931S)
      ], 
      children=[
        MetricCollection(
          name=HttpClient, 
          metrics=[
            MetricRecord(metric=HttpClientName, value=Apache)
          ], 
          children=[]
        )
      ]
    ), 
    MetricCollection(
      name=ApiCallAttempt, 
      metrics=[
        MetricRecord(metric=BackoffDelayDuration, value=PT0.172S), 
        MetricRecord(metric=SigningDuration, value=PT0.001785582S)
      ], 
      children=[
        MetricCollection(
          name=HttpClient, 
          metrics=[
            MetricRecord(metric=HttpClientName, value=Apache)
          ], 
          children=[]
        )
      ]
    ), 
    MetricCollection(
      name=ApiCallAttempt, 
      metrics=[
        MetricRecord(metric=BackoffDelayDuration, value=PT0.205S), 
        MetricRecord(metric=SigningDuration, value=PT0.001690817S)
      ], 
      children=[
        MetricCollection(
          name=HttpClient, 
          metrics=[
            MetricRecord(metric=HttpClientName, value=Apache)
          ], 
          children=[]
        )
      ]
    )
  ]
)

We can see that none of the four ApiCallAttempt metric collections includes ServiceCallDuration, and the HttpClient child collections contain only HttpClientName.
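
To illustrate the behavior being requested, here is a minimal, standard-library-only sketch of recording an attempt's duration in a finally block so that it is reported even when the attempt throws. This is not the SDK's implementation: AttemptTimingSketch, AttemptMetrics, and timedAttempt are hypothetical names made up for this example, and the timeout is simulated by throwing a TimeoutException.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

public class AttemptTimingSketch {

    // Hypothetical stand-in for an ApiCallAttempt metric collection.
    static final class AttemptMetrics {
        final Map<String, Object> records = new LinkedHashMap<>();
    }

    // Runs one attempt and records its duration whether it succeeds or throws.
    static <T> T timedAttempt(AttemptMetrics metrics, Callable<T> serviceCall) throws Exception {
        long start = System.nanoTime();
        try {
            return serviceCall.call();
        } finally {
            // The finally block runs on the timeout path too, so the metric is never skipped.
            metrics.records.put("ServiceCallDuration", Duration.ofNanos(System.nanoTime() - start));
        }
    }

    public static void main(String[] args) {
        AttemptMetrics metrics = new AttemptMetrics();
        try {
            timedAttempt(metrics, () -> {
                Thread.sleep(5); // pretend the request was in flight for about 5 ms
                throw new TimeoutException("simulated attempt timeout");
            });
        } catch (Exception e) {
            System.out.println("attempt failed: " + e.getMessage());
        }
        // The duration is present even though the attempt failed.
        System.out.println(metrics.records);
    }
}
```

With something along these lines, each failed attempt in the output above would carry a ServiceCallDuration record next to BackoffDelayDuration and SigningDuration.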

Labels: feature-request, p3, sdk-metrics
