Skip to content

Support underflow handling without thread sleep #701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

YuryAndr
Copy link

@YuryAndr YuryAndr commented Jul 29, 2021

Proposed Changes

Purpose of PR is to fix bug -#700

According to https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html we should perform read next bytes of message when SSLEngineResult indicates BUFFER_UNDERFLOW state.

In versions before 5.*.* it was fixed using retries with Thread.sleep() https://github.com/rabbitmq/rabbitmq-java-client/blob/87aae532705490fa429afe34326a0ebbdd83e1d3/src/main/java/com/rabbitmq/client/impl/nio/NioHelper.java

But versions 5.*.* don't include this fix. Also It is not good idea to use Thread.sleep() on nio thred.

Current PR provides fetching chunks of bytes using standard NIO flow. SocketChannelFrameHandlerState provides isUnderflowHandlingEnabled attribute to track current state.

@acogoluegnes
Copy link
Contributor

You beat me to it, I just pushed #702. You got the problem right, but I think #702 keeps the changes more local, WDYT?

@YuryAndr
Copy link
Author

YuryAndr commented Jul 29, 2021

This PR takes into account cipherIn.clear() before next chunk reading at SocketChannelFrameHandlerStat. So there is no losses of previous chunks.

UPD It was realized that actually both PRs provide chunks handling without losses. But in case of #702 continueReading returns true due to the cipherIn.hasRemaining() == true. And it leads to reading from buffer in loop without checking selector.keys().

Copy link
Contributor

@acogoluegnes acogoluegnes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice contribution, thanks! Do you think it would be possible to get rid of the isUnderflowHandlingEnabled AtomicBoolean, keep the information in the frame builder, and expose with a package-protected method? The state could reset the flag with another package-protected method. This would avoid to use an atomic class (we don't need the guarantees it provides as this is all mono-threaded).

Comment on lines 183 to 189
if (!isUnderflowHandlingEnabled.get()) {
cipherIn.clear();
cipherIn.flip();
}

cipherIn.flip();
plainIn.clear();
plainIn.flip();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on the need of these operations?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cipherIn should aggregate chunks of bytes on each read. So we should not clear it before each chunk reading.

Copy link
Author

@YuryAndr YuryAndr Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in my implementation continueReading returns false when there is no data for next chunk. So there is no infinite loop around while(continueReading()) while data is not present in buffer. NIO thread can check selector state before trying to get bytes.

}
cipherBuffer.flip();
break;
underflowHandling = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, removing some reading logic and putting it in the class that was already doing, nice.

@YuryAndr
Copy link
Author

Do you think it would be possible to get rid of the isUnderflowHandlingEnabled AtomicBoolean

isUnderflowHandlingEnabled was converted to private attribute of SslEngineFrameBuilder

@michaelklishin
Copy link
Contributor

@YuryAndr thank you for contributing. We will back port after testing but you are welcome to build this client from source and try it out on your actual workload one more time ;)

@acogoluegnes
Copy link
Contributor

acogoluegnes commented Jul 30, 2021

@YuryAndr 5.13.1-SNAPSHOT is available if you want to give it a try. I'll release 5.13.1 if it works for you.

@YuryAndr
Copy link
Author

YuryAndr commented Jul 30, 2021

@acogoluegnes Thank you! It was verified over our setup.

Also it was found that PR had fixed another related issue.

It was noticed that error in BUFFER_UNDERFLOW state handling could lead to out of memory error.

Exception in thread "rabbitmq-nio" java.lang.OutOfMemoryError: Java heap space
	at com.rabbitmq.client.impl.nio.FrameBuilder.readFrame(FrameBuilder.java:96)
	at com.rabbitmq.client.impl.nio.NioLoop.run(NioLoop.java:156)
	at java.base/java.lang.Thread.run(Thread.java:829)

Because in case of 5.13.0 there is no BUFFER_UNDERFLOW state condition checking but only hasRemaining calls on buffers. So FrameBuilder tries to read random data from applicationBuffer and tries to create byte array of random size. For example - about 1.7 GB:

image

It was also verified that there are no such memory issues on 5.13.1-SNAPSHOT.

@YuryAndr
Copy link
Author

YuryAndr commented Aug 2, 2021

@acogoluegnes Could you please also provide patch for 5.7.* version? Because we have modules with dependency on Spring 5.2.* version and spring-amqp 2.2.* https://github.com/spring-projects/spring-amqp/blob/2.2.x/build.gradle And it depends on 5.7.3. version of java client.

@michaelklishin
Copy link
Contributor

We generally do not release patches for older minor series, in particular six series behind. @YuryAndr if you are willing to do all the backporting and testing work for 5.7.x, we can cut a new release.

@michaelklishin
Copy link
Contributor

So yeah, in addition to backporting to 5.13.x-stable we'd have to backport to 5.7.x-stable and then produce a release.
I don't know if we have the pipelines to produce 5.7.x releases but I guess we can generate a one-off.

@YuryAndr YuryAndr mentioned this pull request Aug 2, 2021
@YuryAndr
Copy link
Author

YuryAndr commented Aug 2, 2021

Next PR provides cherry picked commits for 5.7.x #703

@acogoluegnes
Copy link
Contributor

OK, I'll see what I can to release a 5.7.x, but it won't be before several weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants