Closed
Description
A big thanks to @eisbilir for catching and providing details for this issue:
Issue details
When we create an RB with dates, these ES-related events occur in taas-api:
- Send RB creation event to bus-api
- Send WP creation events to bus-api
These events are always fired in this order.
Kafka guarantees the order of data at the partition level. The producer sends the messages in a specific order, the broker writes to the partition in that order, and the consumers read the data in that order.
According to this logic, we expect the ES processor to receive these messages in the same order.
Even though we use 1 partition, the WP creation message sometimes arrives before the RB creation message.
The messages are processed sequentially because we have 1 partition, so the WP creation fails because the RB hasn't been created yet.
This is rare, but it happens. When I read the timestamps of the messages, it's clear that the RB event was fired first, yet the ES processor received them in reverse order.
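The failure mode described above can be sketched as follows. This is only an illustration, not the ES processor's code: the in-memory `index`, the message fields (`kind`, `id`, `rb_id`, `ts`) and `MissingParentError` are all hypothetical, and the one thing taken from the issue is that a WP cannot be created before its RB.

```python
class MissingParentError(Exception):
    """Raised when a WP references an RB that has not been indexed yet."""


def process(messages):
    """Apply messages one at a time, as a consumer of a single partition would."""
    index = {"rb": {}, "wp": {}}  # in-memory stand-in for the ES indices
    failures = []
    for msg in messages:
        try:
            if msg["kind"] == "rb":
                index["rb"][msg["id"]] = msg
            else:
                # Assumption: WP creation needs the parent RB to exist already.
                if msg["rb_id"] not in index["rb"]:
                    raise MissingParentError(msg["id"])
                index["wp"][msg["id"]] = msg
        except MissingParentError as err:
            failures.append(str(err))
    return index, failures


# The timestamps show the RB event was fired first.
rb = {"kind": "rb", "id": "rb-1", "ts": 100}
wp = {"kind": "wp", "id": "wp-1", "rb_id": "rb-1", "ts": 101}

_, ok_failures = process([rb, wp])        # received in send order
_, reversed_failures = process([wp, rb])  # received in reverse order

print(ok_failures)
print(reversed_failures)
```

In the reversed run the WP is rejected even though its timestamp is later than the RB's, which matches what the logs showed.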
I suspect 2 possibilities:
- Even though we have 1 partition per topic, these events belong to two different topics. So it might not be guaranteed that we receive messages in the same order when they come from different topics.
- The bus API wrapper: https://github.com/topcoder-platform/tc-bus-api-wrapper/blob/f8cbd335a0e0b4d6edd7cae859473593271fd97f/src/common/helper.js#L26 Is it possible that, while waiting for the m2m token, the order is broken due to network issues?
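To make the second possibility concrete, here is a minimal sketch of how awaiting a token could reorder two sends, assuming they are started without waiting for each other. It is modeled in Python with asyncio rather than the wrapper's JavaScript; `post_event`, the topic names and the delays are hypothetical and do not reflect the wrapper's real API. Whether taas-api actually starts the sends concurrently is not confirmed here.

```python
import asyncio


async def post_event(topic, token_delay, sent):
    """Hypothetical send: wait for a token first, then publish the event."""
    await asyncio.sleep(token_delay)  # simulated token fetch latency
    sent.append(topic)                # simulated publish


async def fire_concurrently():
    sent = []
    # Both sends start together; the first one happens to get a slow token fetch.
    await asyncio.gather(
        post_event("rb.create", 0.05, sent),
        post_event("wp.create", 0.0, sent),
    )
    return sent


async def fire_sequentially():
    sent = []
    # Each send is awaited before the next starts, so latency cannot reorder them.
    await post_event("rb.create", 0.05, sent)
    await post_event("wp.create", 0.0, sent)
    return sent


concurrent_order = asyncio.run(fire_concurrently())
sequential_order = asyncio.run(fire_sequentially())

print(concurrent_order)
print(sequential_order)
```

If the real sends are awaited one after the other, this hypothesis would not explain the reversal.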
I am not sure; more debugging and log reading are needed.
I just wanted to share this experience with you because I think it would be good to have this info for future bugs. For me, this scenario happens roughly once in every 20 to 30 attempts.
Issue reason
This happens because Kafka does not guarantee the order of messages across different topics; ordering is only guaranteed within a single partition. So even if we send the messages in the correct order, there is no guarantee that the consumer receives them in that order.
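A small simulation of that reasoning, under stated assumptions: each topic is modeled as one ordered log (1 partition), and the consumer receives whichever topic's fetch happens to return first. The topic names, message ids and the `consume` helper are hypothetical; this is not Kafka client code.

```python
from collections import deque


def consume(topics, fetch_order):
    """Drain single-partition topic logs in the order the fetches return."""
    logs = {name: deque(msgs) for name, msgs in topics.items()}
    received = []
    for name in fetch_order:
        if logs[name]:
            # Within one topic, messages always come out in write order.
            received.append(logs[name].popleft())
    return received


topics = {
    "rb.create": [{"id": "rb-1", "ts": 1}, {"id": "rb-2", "ts": 3}],
    "wp.create": [{"id": "wp-1", "rb_id": "rb-1", "ts": 2}],
}

# Fetches happen to return in send order.
in_order = consume(topics, ["rb.create", "wp.create", "rb.create"])
# The wp.create fetch happens to return first.
out_of_order = consume(topics, ["wp.create", "rb.create", "rb.create"])

print([m["id"] for m in in_order])
print([m["id"] for m in out_of_order])
```

In both runs rb-1 still comes before rb-2, so the per-partition guarantee holds. Only the relative order between the two topics changes, which is exactly the case where wp-1 can be seen before rb-1.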