Producer logging thousands of errors in a very short timespan (Broker transport failure) #1328

Open
@Atheuz

Description

I have a problem with a producer in an API. Once in a while it throws errors like SASL authentication error: SaslAuthenticateRequest failed: Local: Broker transport failure (after 0ms in state DOWN). A few of these wouldn't be a problem on their own, but whenever a transport failure occurs I get THOUSANDS of log entries, as if it were retrying every single millisecond.

For instance, here's a 1½-minute window in which it logged that error 10,000 times:

image

I'm fine with it logging errors and retrying the request, but I don't understand why it retries so frequently. I've tried adjusting the producer config to make it back off, but even with the defaults it seems like it shouldn't be retrying that often.
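
As a rough check of that expectation: the librdkafka configuration documentation describes reconnect.backoff.ms as increasing exponentially until reconnect.backoff.max.ms is reached, with jitter applied to each backoff. The sketch below is a simplified model of that schedule, not librdkafka's actual code. It ignores jitter and assumes every attempt fails immediately, and the function name `reconnect_attempts` is made up for illustration.

```python
def reconnect_attempts(window_s, initial_ms, max_ms):
    """Count reconnect attempts that fit in `window_s` seconds.

    Simplified model: the wait before each attempt doubles after every
    failure and is capped at `max_ms`. Jitter and connection time are ignored.
    """
    elapsed_s = 0.0
    backoff_ms = initial_ms
    attempts = 0
    while True:
        elapsed_s += backoff_ms / 1000.0  # wait out the current backoff
        if elapsed_s > window_s:
            break
        attempts += 1  # attempt is made and (by assumption) fails at once
        backoff_ms = min(backoff_ms * 2, max_ms)
    return attempts


# Values taken from the client configuration in this issue.
print(reconnect_attempts(window_s=90, initial_ms=500, max_ms=5000))
```

Under these assumptions the model gives about 20 attempts in 90 seconds, which is orders of magnitude fewer than the roughly 10,000 log entries observed.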

Here's an example of the error:

FAIL [rdkafka#producer-1] [thrd:sasl_ssl://broker:9092/bootstrap]: sasl_ssl://broker:9092/1: SASL authentication error: SaslAuthenticateRequest failed: Local: Broker transport failure (after 0ms in state DOWN)
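
One possible stopgap for the log volume, which does not address why the errors repeat, is to throttle identical messages on the Python side. The sketch below assumes the client's log lines are forwarded to a Python logger (confluent-kafka-python accepts a `logger` entry in the client configuration). The class name `ThrottleFilter` and the logger name `kafka_client` are hypothetical, and the dictionary of seen messages is never pruned, so this is only suitable when the set of distinct messages is small.

```python
import logging
import time


class ThrottleFilter(logging.Filter):
    """Drop records whose message repeats within `interval_s` seconds."""

    def __init__(self, interval_s=5.0, clock=time.monotonic):
        super().__init__()
        self.interval_s = interval_s
        self.clock = clock  # injectable so the filter can be tested
        self._last_seen = {}  # message text -> time it was last let through

    def filter(self, record):
        now = self.clock()
        key = record.getMessage()
        last = self._last_seen.get(key)
        if last is not None and now - last < self.interval_s:
            return False  # same message was logged recently: suppress it
        self._last_seen[key] = now
        return True


# Hypothetical logger; it would be passed to the client as
# {"logger": kafka_logger, ...} alongside the rest of the configuration.
kafka_logger = logging.getLogger("kafka_client")
kafka_logger.addFilter(ThrottleFilter(interval_s=5.0))
```

With a 5-second interval, a burst of identical FAIL lines would be reduced to at most one entry every 5 seconds per distinct message.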

My producing code looks like this:

import datetime
import json
import uuid

def produce_event(topic, event_data, kafka_producer):
    """Produce an Event to Kafka."""
    event_id = str(uuid.uuid4())
    event_dt = datetime.datetime.utcnow()
    event = Event(
        event_id=UUID(value=event_id),
        event_datetime=Timestamp(seconds=int(event_dt.timestamp()), nanos=0),
        event_data=JsonString(content=json.dumps(event_data)),
    )
    kafka_producer.produce(topic, key=event_id, value=event, on_delivery=kafka_delivery_report)
    kafka_producer.poll()

How to reproduce

Unknown

Checklist

Please provide the following information:

  • confluent-kafka-python and librdkafka version: ('1.8.2', 17302016) and ('1.8.2', 17302271)
  • Apache Kafka broker version: 2.7.0 (on Confluent Platform 6.1.4)
  • Client configuration: { "retry.backoff.ms": 1000, "reconnect.backoff.ms": 500, "reconnect.backoff.max.ms": 5000, "key.serializer": string_serializer, "value.serializer": proto_serializer }
  • Operating system: Debian
  • Provide client logs (with 'debug': '..' as necessary)
  • Provide broker log excerpts
  • Critical issue
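
For the client logs item above, a minimal sketch of how the `debug` property could be added to the configuration listed in this checklist. The helper name `with_debug` is hypothetical, the choice of the `broker` and `security` debug contexts is an assumption about what is relevant to a SASL transport failure, and the serializers and connection settings are left out because the sketch only manipulates a plain dict.

```python
def with_debug(base_config, contexts=("broker", "security")):
    """Return a copy of `base_config` with librdkafka's `debug` property set."""
    config = dict(base_config)  # copy so the original config is untouched
    config["debug"] = ",".join(contexts)
    return config


# Backoff settings as listed in the client configuration of this issue.
base_config = {
    "retry.backoff.ms": 1000,
    "reconnect.backoff.ms": 500,
    "reconnect.backoff.max.ms": 5000,
}

debug_config = with_debug(base_config)
print(debug_config["debug"])
```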
