Skip to content

CASSANDRA-19635 - Run integration tests with C* 5.x #1934

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 3, 2024

Conversation

lukasz-antoniak
Copy link
Member

JIRA link: https://issues.apache.org/jira/browse/CASSANDRA-19635

PR contains necessary changes to run integration tests with Apache Cassandra 5.0-beta1. Developers can run the tests using:

$ mvn clean install -Dccm.version=5.0-beta1 -DtestJavaHome=/opt/java17/home -Ptest-jdk-17

Make sure you have latest CCM version installed locally and available in $PATH.

Copy link
Contributor

@tolbertam tolbertam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments/suggestions. This seems to be working good for me when I use JDK 17 as my primary JVM.

I'm still running through the suite locally, but leaving feedback early as I wait on that to complete.

* several schema changes at once. Limit concurrently executed DDLs to 5.
*/
public class SchemaChangeSynchronizer {
private static final Semaphore lock = new Semaphore(5);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why we shouldn't just set this to 1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still wanted to preserve some parallelism in builds with lower C* versions. Also did not want to make this logic C* version dependent, just to simplify things. If you wish I am happy to change it to 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that concurrent schema changes happen to 'work' in older versions is probably more by luck than anything. I think it likely would not slow down things noticeable if we set it to 1, and its a lot more appropriate (have seen too many incidents caused by concurrent schema changes, its always nice to enforce 1 at a time). That said, i'm cool with merging and leaving it at 5 as well, since it's better than what it does currently 👍

import org.junit.rules.RuleChain;
import org.junit.rules.TestRule;

@Category(ParallelizableTests.class)
// Do not run LWT tests in parallel because they may interfere. Tests operate on the same row.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment isn't applicable right? Probably can remove it as I think the annotation description describes it well.

Copy link
Contributor

@tolbertam tolbertam May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see, there are LWT specific tests in here, so this comment is actually applicable (should_delete_if_exists_reactive for example)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was a flaky test that I have fixed while running complete IT set multiple times a day :).

@@ -38,7 +38,7 @@ public abstract class BaseCcmRule extends CassandraResourceRule {
new Thread(
() -> {
try {
ccmBridge.remove();
ccmBridge.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change from remove to close? Possible this was changed to check directory outputs after but forgot to change back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made CcmBridge#close() dependent on created flag. Now we execute ccm remove only once if the cluster was previously created and not removed after that. If you run FT and log CCM command executions, you would see that we clean-up cluster twice (execute ccm remove):

  1. In BaseCcmRule#after().
  2. As part of AutoCloseable and resource close() method.

Of course running ccm remove twice does not do any harm. IMO this fix is most about taste, so I can revert this back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you've explained it, it makes sense to me, +1 for keeping it this way.

Copy link
Contributor

@tolbertam tolbertam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is excellent, thank you @lukasz-antoniak !

// Do not run LWT tests in parallel because they may interfere. Tests operate on the same row.
@BackendRequirement(
type = BackendType.CASSANDRA,
description = "SASI indexes broken in DSE. See InventoryITBase#BROKEN_SASI_VERSION.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too strong a constraint. IIRC SASI indexes were borked in a specific DSE release. The commit which introduced BROKEN_SASI_VERSION has a lot of useful context here, specifically the definition of isSasiBroken() which checks for an exact DSE version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a wrong impression that there are more DSE versions that are actually broken. Hereby integration tests for deletion of entities, we do not want to run in parallel, because they may interfere one with another. Shared CcmBridge can be used only with @Category(ParallelizableTests.class), which we do not want. CustomCcmRule is mostly used together with @BackendRequirement. I had a problem with this test running agains DSE, because enable_sasi_indexes is an invalid configuration parameter.

So to make this test work for C*, I had to set enable_sasi_indexes = true, but not for DSE.

After further debugging, I have found the root cause - DefaultCcmBridgeBuilderCustomizer is not applied to CustomCcmRule. I shall not reuse DefaultCcmBridgeBuilderCustomizer, because it adds other parameters that causes other integration tests to break (once using CustomCcmRule). In the end, I have implemented a dedicated configureCcm method inside the tests. enable_sasi_indexes is required for one delete statement to succeed inside this test class, so I think we can leave configuration local.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine to me; it fits nicely with the pattern of each test being responsible for whatever customization it needs to operate.

@absurdfarce
Copy link
Contributor

Agree with @tolbertam ; well done @lukasz-antoniak! There are a few nits I'd like to see cleaned up but once those are in place I think I'm a 👍 as well.

Copy link
Contributor

@absurdfarce absurdfarce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @lukasz-antoniak! With your most recent changes I think we're there!

@absurdfarce
Copy link
Contributor

Excellent, thanks for making those final changes so quickly @lukasz-antoniak! With your most recent work I believe we have all the code changes we need to get this merged!

The next step is to consolidate all of these changes down to a single commit. If it's useful we've provided some fairly detailed instructions on how to do this in a comment on a prior PR. In your case the commit message should read something like the following:

CASSANDRA-19635: Run integration tests with C* 5.x
    
patch by Lukasz Antoniak; reviewed by Andy Tolbert, and Bret McGuire for CASSANDRA-19635

Feel free to ask questions if any part of that process isn't clear!

patch by Lukasz Antoniak; reviewed by Andy Tolbert, and Bret McGuire for CASSANDRA-19635
@absurdfarce
Copy link
Contributor

Yup, this looks good to me. Thanks @lukasz-antoniak!

@absurdfarce absurdfarce merged commit 432e107 into apache:4.x Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants