Description
Expected Behavior
Sending a message with SqsAsyncClient.sendMessage(req)
should return a future that eventually completes, either with a response or with an error.
Current Behavior
When thousands of SQS messages are sent in quick succession, on very rare occasions a single sendMessage(req) call
returns a future that never completes. There is no result and no error; the request just disappears.
Possible Solution
There have been Netty-related issues in this space before, so I would look there first. With a previous version of the SDK (2.4.14) this was more common than in the 2.5 version.
Steps to Reproduce (for bugs)
Send a few hundred thousand SQS messages using the SqsAsyncClient.sendMessage method. Race each returned future against a second future that fails after a timeout (say, 10 seconds) and see which of the two completes first.
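One way to spot the affected requests is sketched below. It is only an illustration, not the code we run: `HungFutureProbe`, `findHung` and the `send` parameter are hypothetical names, and `send` stands in for a call such as `sqs.sendMessage(request).toScala`. The probe starts all sends, polls until every future has completed or the deadline has passed, and returns the indices of the futures that are still incomplete.

```scala
import scala.concurrent.Future
import scala.concurrent.duration._

// Hypothetical helper, not part of the AWS SDK.
object HungFutureProbe {
  /** Starts `count` sends and returns the indices whose future has not
    * completed (successfully or with an error) within `timeout`. */
  def findHung[T](count: Int, timeout: FiniteDuration)(send: Int => Future[T]): Seq[Int] = {
    // Start every send up front so the requests are in flight concurrently.
    val started = (0 until count).map(i => i -> send(i))
    val deadline = timeout.fromNow
    // Poll until everything has completed or the deadline has passed.
    while (deadline.hasTimeLeft() && !started.forall(_._2.isCompleted)) {
      Thread.sleep(10)
    }
    // A failed future counts as completed; only silent ones are reported.
    started.collect { case (i, f) if !f.isCompleted => i }
  }
}
```

With the real client, `send` would build and send one message per index; any index returned here corresponds to a request that produced neither a response nor an error.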
A Scala example for retrying the send after a timeout.
private def sendRetry(request: SendMessageRequest, retryCount: Int = 3): Future[SendMessageResponse] = {
  val res = sqs.sendMessage(request).toScala
  val timeout = APIErrorJVM.delayFuture[SendMessageResponse](Failure(new TimeoutException()), 10.seconds)
  Future.firstCompletedOf(List(res, timeout)) recoverWith {
    case _: TimeoutException if retryCount > 0 =>
      log.error(s"Timeout while sending message $request, retry count = $retryCount")
      sendRetry(request, retryCount - 1)
  }
}
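`APIErrorJVM.delayFuture` above is a helper from our own code base and is not shown here. A minimal stand-in, using only the JDK and the Scala standard library, might look like the sketch below. The names `TimeoutRace` and `withTimeout` are made up for this illustration, and the real helper may be implemented differently.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Try}

// Hypothetical stand-in for the application helper; not part of the AWS SDK.
object TimeoutRace {
  // One daemon thread that only fires timers, so it never blocks JVM shutdown.
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "timeout-race")
        t.setDaemon(true)
        t
      }
    })

  /** Returns a future that completes with `result` once `delay` has elapsed. */
  def delayFuture[T](result: Try[T], delay: FiniteDuration): Future[T] = {
    val promise = Promise[T]()
    scheduler.schedule(
      new Runnable { def run(): Unit = { promise.tryComplete(result); () } },
      delay.toMillis,
      TimeUnit.MILLISECONDS
    )
    promise.future
  }

  /** Races `f` against a timer; fails with TimeoutException if the timer wins. */
  def withTimeout[T](f: Future[T], timeout: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
    val timer = delayFuture[T](Failure(new TimeoutException(s"no completion within $timeout")), timeout)
    Future.firstCompletedOf(List(f, timer))
  }
}
```

Note that the race only stops the caller from waiting forever. The underlying request is not cancelled, so a retry after a timeout can deliver the same message twice if the first attempt did in fact reach SQS.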
Context
SQS is the source of truth in our application, so if sending an SQS message fails silently, the whole application logic is in jeopardy. We had to wrap the SDK call in an application-level timeout to work around this.
Your Environment
- AWS Java SDK version used: 2.5.25
- JDK version used: 1.8.0_172
- Operating System and version: Linux in AWS