Description
Current behavior:
WebClient long task timer task active metrics does not show "destination host" in "client_name" label, instead always shows "none".
http_client_requests_active_seconds_active_count{client_name="none",exception="none",method="none",outcome="UNKNOWN",status="CLIENT_ERROR",uri="none",} 7.0
In service which has more than 1 webClient instance to different remote host all long task timer active metrics are aggregated and metrics cannot detect the number of pending requests per destination host. This decrease the value of the metrics as breakdown per destination host calls cannot be done.
When MVC is used instead of webflux then "client_name" label contains correct destination host".
http_client_requests_active_seconds_active_count{client_name="serviceB.local",exception="none",method="none",outcome="UNKNOWN",status="CLIENT_ERROR",uri="none",} 5.0
http_client_requests_active_seconds_active_count{client_name="serviceC.local",exception="none",method="none",outcome="UNKNOWN",status="CLIENT_ERROR",uri="none",} 2.0
Expected behavior:
webClient should report "client_name" label correctly matching remote destination address shown in normal metrics.
Potential root cause:
DefaultWebClient exchange() store the requestsBuilder() to carrier() before observation start, but "request" is stored to context only after observation is started just before applying the filter.
DefaultClientRequestObservationConvention/clientName() reads the "client_name" from context when long task timer is started and get "null" because context is not set. Thus "client_name" is not available for long task timer active start but only for normal metrics stop.
MVC works because "client_name" is read from carrier.
There were some other changes in DefaultWebClient blame but I am not sure if those change or broke the the "client_name" because context is still set after start.
public class DefaultClientRequestObservationConvention implements ClientRequestObservationConvention {
protected KeyValue clientName(ClientRequestObservationContext context) {
if (context.getRequest() != null && context.getRequest().url().getHost() != null) {
return KeyValue.of(LowCardinalityKeyNames.CLIENT_NAME, context.getRequest().url().getHost());
}
return CLIENT_NAME_NONE;
}
}
final class DefaultWebClient implements WebClient {
@SuppressWarnings("deprecation")
@Override
public Mono<ClientResponse> exchange() {
ClientRequestObservationContext observationContext = new ClientRequestObservationContext();
ClientRequest.Builder requestBuilder = initRequestBuilder();
return Mono.deferContextual(contextView -> {
Observation observation = ClientHttpObservationDocumentation.HTTP_REACTIVE_CLIENT_EXCHANGES.observation(observationConvention,
DEFAULT_OBSERVATION_CONVENTION, () -> observationContext, observationRegistry);
observationContext.setCarrier(requestBuilder); // carrier available in start
observation
.parentObservation(contextView.getOrDefault(ObservationThreadLocalAccessor.KEY, null))
.start();
ExchangeFilterFunction filterFunction = new ObservationFilterFunction(observationContext);
if (filterFunctions != null) {
filterFunction = filterFunctions.andThen(filterFunction);
}
ClientRequest request = requestBuilder.build();
observationContext.setUriTemplate((String) request.attribute(URI_TEMPLATE_ATTRIBUTE).orElse(null));
observationContext.setRequest(request); // context saved after start
Mono<ClientResponse> responseMono = filterFunction.apply(exchangeFunction)
.exchange(request)
.checkpoint("Request to " + this.httpMethod.name() + " " + this.uri + " [DefaultWebClient]")
.switchIfEmpty(NO_HTTP_CLIENT_RESPONSE_ERROR);
if (this.contextModifier != null) {
responseMono = responseMono.contextWrite(this.contextModifier);
}
final AtomicBoolean responseReceived = new AtomicBoolean();
return responseMono
.doOnNext(response -> responseReceived.set(true))
.doOnError(observationContext::setError)
.doFinally(signalType -> {
if (signalType == SignalType.CANCEL && !responseReceived.get()) {
observationContext.setAborted(true);
}
observation.stop();
})
.contextWrite(context -> context.put(ObservationThreadLocalAccessor.KEY, observation));
});
}
}
Possible customization workaround is reading the clientName from context.getCarrier().build() instead from context. But not clear if this is safe way in case webClient implementation change again.
@Configuration
class NdfClientRequestObservationConvention extends DefaultClientRequestObservationConvention {
private static final KeyValue CLIENT_NAME_NONE = KeyValue.of(ClientHttpObservationDocumentation.LowCardinalityKeyNames.CLIENT_NAME, KeyValue.NONE_VALUE);
@Override
protected KeyValue clientName(ClientRequestObservationContext context) {
if (context.getRequest() != null && context.getRequest().url().getHost() != null) {
return KeyValue.of(ClientHttpObservationDocumentation.LowCardinalityKeyNames.CLIENT_NAME, context.getRequest().url().getHost());
}
if (context.getCarrier() != null && context.getCarrier().build().url().getHost() != null) {
return KeyValue.of(ClientHttpObservationDocumentation.LowCardinalityKeyNames.CLIENT_NAME, context.getCarrier().build().url().getHost());
}
return CLIENT_NAME_NONE;
}
}
Environment:
Java: 17/21
SpringBoot 3.1.5, same issue in 3.1.6 and 3.2.0