Differentiate back-off exceptions from 'real' application errors in Listener Micrometer timer metrics for retry topics #2237

**Expected Behavior**

As per the [docs on 'Monitoring Listener Performance'](https://docs.spring.io/spring-kafka/docs/latest-ga/reference/html/#monitoring-listener-performance), there are `Micrometer` timers called `spring.kafka.listener` which are tagged with a `result` (`success` or `failure`) and `exception`. I would expect the metrics generated with the `failure` tag to capture true failures (e.g. an `IOException` from some resource that is used to process records). Any back-off exceptions, which are _expected_ to occur for topics with a delay configured, should be treated separately, e.g. with a different tag value for `result` or `exception`.

**Current Behavior**

A [`failure` timer is recorded](https://github.com/spring-projects/spring-kafka/blob/10905dc5b3b47abf9364769d9f4e4df0b17e39a8/spring-kafka/src/main/java/org/springframework/kafka/listener/KafkaMessageListenerContainer.java#L2509) whenever a `RuntimeException` occurs while processing a record. With retry topics, this includes a `KafkaBackoffException`, which may be thrown inside `invokeOnMessage` (or the batch equivalent) when the listener determines that the latest record, based on its timestamp, is not yet ready to be processed. The `exception` tag is always recorded as `ListenerExecutionFailedException`, so there is no way to distinguish back-off exceptions from other exceptions.
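To illustrate the difference, here is a minimal, framework-free sketch. The two exception classes are local stand-ins for the spring-kafka types of the same name, and the `backoff` tag value and the `proposedResultTag` helper are hypothetical names chosen only for this example; this is not the container's actual code.

```java
class BackoffTagSketch {

    // Local stand-ins for the spring-kafka exception types named in this issue.
    static class ListenerExecutionFailedException extends RuntimeException {
        ListenerExecutionFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static class KafkaBackoffException extends RuntimeException {
        KafkaBackoffException(String message) {
            super(message);
        }
    }

    // Behavior as described above: only the outermost (wrapper) type is visible.
    static String currentExceptionTag(Throwable thrown) {
        return thrown.getClass().getSimpleName();
    }

    // Walks the cause chain looking for a back-off exception.
    static boolean isBackoff(Throwable thrown) {
        for (Throwable t = thrown; t != null; t = t.getCause()) {
            if (t instanceof KafkaBackoffException) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical alternative: give back-offs their own "result" tag value.
    static String proposedResultTag(Throwable thrown) {
        if (thrown == null) {
            return "success";
        }
        return isBackoff(thrown) ? "backoff" : "failure";
    }

    public static void main(String[] args) {
        Throwable backoff = new ListenerExecutionFailedException(
                "listener failed", new KafkaBackoffException("not due yet"));
        Throwable real = new ListenerExecutionFailedException(
                "listener failed", new IllegalStateException("database is unavailable"));

        System.out.println(currentExceptionTag(backoff) + " / " + proposedResultTag(backoff));
        System.out.println(currentExceptionTag(real) + " / " + proposedResultTag(real));
    }
}
```

Both throwables produce the same `exception` tag under the current behavior, while a cause-chain check is enough to tell them apart. Whether the distinction should surface in `result` or in `exception` is left open.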

**Context**

I would like to analyze the listener metrics to gain insight into failures (how often they happen, their performance impact, etc.). However, I'm interested in application logic failures (e.g. the database is unavailable) rather than expected framework-level failures (back-off exceptions). I was surprised to see my metrics indicating many failures even though the application logs showed that all records were processed successfully, until I realized that the failures must actually be due to these `KafkaBackoffException`s.

I could implement my own timers/metrics inside my `KafkaListener`, but I would prefer to be able to use the existing timers that are provided by the framework.
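For reference, the workaround would look roughly like the sketch below. It uses only the JDK so that it stays self-contained; `ListenerTimingSketch`, `timed` and `count` are hypothetical names, and a real application would presumably record into Micrometer timers rather than an in-memory map. It assumes the timing wraps only the application's own processing logic, so exceptions raised by the framework outside that logic are never counted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class ListenerTimingSketch {

    // Invocation count and accumulated duration for one result value.
    static final class Stats {
        final LongAdder count = new LongAdder();
        final LongAdder totalNanos = new LongAdder();
    }

    private final Map<String, Stats> statsByResult = new ConcurrentHashMap<>();

    // Runs the application's processing logic and records its outcome and duration.
    void timed(Runnable processing) {
        long start = System.nanoTime();
        String result = "success";
        try {
            processing.run();
        } catch (RuntimeException e) {
            result = "failure";
            throw e; // rethrow so the caller's error handling still applies
        } finally {
            Stats stats = statsByResult.computeIfAbsent(result, key -> new Stats());
            stats.count.increment();
            stats.totalNanos.add(System.nanoTime() - start);
        }
    }

    // Number of invocations recorded for the given result value.
    long count(String result) {
        Stats stats = statsByResult.get(result);
        return stats == null ? 0 : stats.count.sum();
    }
}
```

A listener method would wrap its body in `timed(() -> ...)`. This duplicates what the framework timers already do, which is why a framework-level distinction would be preferable.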

