JAVA-3142: Ability to optionally specify remote dcs for deterministic failovers when remote dcs are used in query plan #1896
Conversation
Thanks for your proposal. I think the idea of preferred remote dcs makes sense, however I think the proposed implementation could be simplified.
```java
int remoteNodesLength = remoteNodes.length;
if (remoteNodesLength == 0) {
  return EMPTY_NODES;
if (!preferredRemoteLocalDcs.isEmpty()) {
```
Isn't this unnecessarily complex? I think the same effect could be achieved with a sorted stream:

```java
Object[] remoteNodes =
    liveNodes.dcs().stream()
        .filter(Predicates.not(Predicates.equalTo(localDc)))
        .sorted(
            Comparator.comparingInt(
                dc -> {
                  int i = Arrays.binarySearch(preferredRemoteDcs, dc);
                  return i < 0 ? Integer.MAX_VALUE : i;
                }))
        .flatMap(dc -> liveNodes.dc(dc).stream().limit(maxNodesPerRemoteDc))
        .toArray();
```

And indeed, in the presence of preferred dcs, it probably doesn't make sense to shuffle the remote nodes:

```java
if (preferredRemoteDcs.length == 0) {
  shuffleHead(remoteNodes, remoteNodesLength);
}
```
Should the nodes within the same preferred remote DC be shuffled? For example, if node1 and node2 are in remote dc1, node3 and node4 are in remote dc2, maxNodesPerRemoteDc=2, and the preference order specified is dc1, dc2, then the remote nodes in the query plan will always be: node1, node2, node3, node4. Does it make sense to shuffle the nodes within a preferred remote DC, i.e. could it be ((node2, node1), (node4, node3))?
Yes, that is the big question here. I'm hesitating on the practical usefulness vs added complexity, but I think it makes sense to do as you suggest. If you take my code snippet above, this should be easily doable by changing the `flatMap` operator.
However I realize I owe you an apology: using `Arrays.binarySearch` is wrong because `preferredRemoteDcs` is not necessarily sorted. We need `preferredRemoteDcs` to become a `List<String>` again and use `indexOf` instead:

```java
.sorted(
    Comparator.comparingInt(
        dc -> {
          int i = preferredRemoteDcs.indexOf(dc);
          return i < 0 ? Integer.MAX_VALUE : i;
        }))
```
Sorry for that 😅
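As a self-contained illustration of the ordering this comparator produces (DC names are invented for the demo, and plain JDK classes stand in for the driver's internals):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Standalone sketch of the indexOf-based comparator discussed above:
// preferred DCs sort first, in list order; DCs not in the preference
// list map to Integer.MAX_VALUE and keep their relative order at the
// end, because the sort is stable.
public class PreferredDcOrder {

  static String[] order(List<String> preferredRemoteDcs, String[] remoteDcs) {
    return Arrays.stream(remoteDcs)
        .sorted(
            Comparator.comparingInt(
                (String dc) -> {
                  int i = preferredRemoteDcs.indexOf(dc);
                  return i < 0 ? Integer.MAX_VALUE : i;
                }))
        .toArray(String[]::new);
  }

  public static void main(String[] args) {
    String[] ordered = order(List.of("dc3", "dc2"), new String[] {"dc2", "dc4", "dc3", "dc5"});
    System.out.println(String.join(",", ordered)); // dc3,dc2,dc4,dc5
  }
}
```

With preference list [dc3, dc2], the unlisted dc4 and dc5 fall to the back in their original relative order.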
@adutra Thank you for pointing it out. I have addressed the code review comments. Please take a look.
Thanks @adutra for the insightful code review comments, really appreciate it. I'll address all of them; I have posted an open question/clarification above.
Thanks @adutra for the code review; I have addressed all the review comments.
LGTM with just a few minor suggestions. The problem is, I don't feel comfortable merging any PRs until we have a proper CI infrastructure. Maybe @absurdfarce knows where we stand right now wrt CI and all that jazz?
Thanks @adutra for the code review and the suggestions; incorporated both of them. Will wait for @absurdfarce to let us know about CI tests.
Hey @adutra @absurdfarce, checking in: would the PR be accepted?
Hi @nitinitt, sorry for the late reply. I'm syncing with @absurdfarce and others in order to merge all the ready PRs, but that is taking some time because of a huge backlog and some technical issues. We'll eventually get to yours. We'd need a second committer to have a look. Please be patient :-)
Adding an enthusiastic +1 here, thank you!
I can see this having a lot of value in high-availability 3-DC setups where folks have a replication configuration of 2:2:2. In the event of a local DC failure, it is desirable to fail over to the closest DC. For example, imagine you have data centers deployed in AWS in us-east-1 (Virginia), us-east-2 (Ohio), and us-west-1 (California).
If you have an application that considers us-east-1 local and that DC fails, failing over to us-east-2 would be desirable over us-west-1.
Apologies @nitinitt, I know this has taken a long time to get to... I really appreciate your patience. Overall I'm pretty positive on the change, but I'm a bit concerned about one aspect of the implementation. As currently written, the changes intermingle the "preferred DC" and "no preferred DC" cases when (maybe) adding remote DC failover nodes. I'd rather make that division much cleaner so that it's very clear which part of the code does what. I've taken a stab at a set of changes on top of this PR (as of this writing) to show what that might look like. Would you mind taking a look and letting me know what you think? Thanks again! And thanks to @adutra and @tolbertam for their reviews as well!
Thanks @tolbertam and @absurdfarce for taking the time and helping to simplify the code with the refactored PR. I am not sure how the changes would be integrated; should I sync this PR with the branch: https://github.com/absurdfarce/cassandra-java-driver/tree/java3142-refactor ?
@nitinitt Probably the most straightforward way to incorporate the changes would be to get a patch file from the other PR and then apply that patch to this PR. You can get the patch file with something like the following command:
You can then apply the changes using git. If you want to just bring in the changes and commit them yourself you can use "git apply":
Or you can directly create new commits in your repo for every commit in the patch file using "git am":
If you're at all unsure about what to do, it's probably best to start with "git apply". The changes will be applied to the files but nothing will be committed; you can always do a "git reset" to get your repo back to the state of your last commit. Note that if you have any other local changes, "git reset" will remove those as well, so be careful when you use it! I just tested this locally: a patch file built off of this commit applies cleanly to a working branch containing the same commits you have in your PR branch.
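The workflow described above can be sketched end-to-end in a throwaway repository (repo, file, and commit names here are invented for the demo, not the actual PR's):

```shell
# Demo of the format-patch / apply workflow in a scratch repo.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email demo@example.com
git config user.name demo
echo base > README.md
git add README.md && git commit -qm "base"
# Make a commit that we will export as a patch:
echo "preferred-remote-dcs = [ dc2, dc3 ]" > reference.conf
git add reference.conf && git commit -qm "add failover config"
git format-patch -1 HEAD --stdout > ../refactor.patch
# Pretend we never had that commit, then re-apply it with "git apply"
# (working tree only, nothing committed -- easy to undo with "git reset"):
git reset -q --hard HEAD~1
git apply ../refactor.patch
git status --short   # reference.conf shows up as an uncommitted new file
```

Using "git am" instead of "git apply" in the last step would create a commit (with the original authorship) rather than leaving the changes uncommitted.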
Thank you @absurdfarce for the steps, I was able to get the patch and apply it over the last commit using: |
Excellent, that looks great @nitinitt! Thanks for jumping on that so quickly! As for your question: we do indeed plan on merging this ticket before the next release (version 4.18.1). In fact, according to our planning document, this PR is the last item we planned to get in for 4.18.1. So once we're done here we should be ready to go. The last thing we need to do is to squash all the commits on this feature branch down to a single commit. You can do that with git via an interactive rebase. There's a decent article that walks through the steps here; you're specifically interested in this section. Once you've successfully squashed the commits you'll need to do a force push to your repository; you're actually re-writing your git history via this process so a force push is required. In your case the commands should look something like the following:
I've run this rebase operation locally a few times now against 4.x and I can confirm the patch applies cleanly. You may also want to consider saving a patch for the current (i.e. final) state of this branch so that we can re-create it if necessary. You can do that with the following command (similar to what you used before):
I've also done this, so I have a copy of all the changes on this feature branch. Finally, regarding the commit message of your squashed commit: this should look something like the following:
If you have any questions about any of the above comments please don't hesitate to ask!
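For illustration, the squash can also be done non-interactively in a scratch repo with a soft reset, which has the same end result as the interactive rebase described above (names and messages here are invented):

```shell
# Demo: squash three WIP commits into a single commit, keeping the tree.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email demo@example.com
git config user.name demo
echo base > README.md
git add README.md && git commit -qm "base"
for i in 1 2 3; do
  echo "change $i" >> work.txt
  git add work.txt
  git commit -qm "wip $i"
done
# Soft reset moves HEAD back but leaves the working tree and index intact,
# so one new commit captures all three WIP changes:
git reset -q --soft HEAD~3
git commit -qm "JAVA-3142: preferred remote dcs for failover (squashed)"
# On a real branch, rewriting history like this requires a force push:
#   git push --force origin <feature-branch>
git rev-list --count HEAD   # now 2: base + squashed commit
```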
…onfiguration for graceful automatic failovers patch by Nitin Chhabra; reviewed by Alexandre Dutra, Andy Tolbert, and Bret McGuire for JAVA-3142
Thanks @absurdfarce for the guidance and the patience. I have squashed and force pushed, and here is how the git log looks:
Sorry, I was struggling a little bit with formatting in the git message, hence the extra force push.
```
@@ -574,6 +574,11 @@ datastax-java-driver {
  # Modifiable at runtime: no
  # Overridable in a profile: yes
  allow-for-local-consistency-levels = false
  # Ordered preference list of remote dc's (in order) optionally supplied for automatic failover. While building a query plan, the driver uses the DC's supplied in order together with max-nodes-per-remote-dc
```
Nit: there should be a blank line between this and `allow-for-local-consistency-levels` in order to make it clear we're talking about two distinct configs.
Another nit: might be worth mentioning that users aren't required to specify all DCs when listing preferences via this config.
Thanks @nitinitt, this looks great! I concur that this commit looks like it contains all the fixes we've been discussing, so we are ready to get this guy in! Big thank you for all your help in making this happen!
Thanks @absurdfarce! I have addressed the 2 minor documentation comments as part of: https://github.com/apache/cassandra-java-driver/pull/1933/files |
```java
map.put(
    TypedDriverOption.LOAD_BALANCING_DC_FAILOVER_PREFERRED_REMOTE_DCS, ImmutableList.of(""));
```
Hi @nitinitt. Thanks for this contribution, very useful.
While testing it on my side, I've stumbled upon this default (we use OptionsMap for configuring the driver, but I see the same default is in the `reference.conf`). With this default, we never go to `buildRemoteQueryPlanAll` and always fall into `buildRemoteQueryPlanPreferred`, which obviously cannot find nodes of the DC `""`.
Was this intentionally set to default to a list with an empty string for some reason?
Hi @jahstreet,
Is the empty check in the code (https://github.com/apache/cassandra-java-driver/pull/1896/files#diff-bbdb6181132c9a28e06a98058d5b3db08d3398ffb4233f242f8da44b26a9249bR333-R336)

```java
if (preferredRemoteDcs.isEmpty()) {
  return new CompositeQueryPlan(local, buildRemoteQueryPlanAll());
}
```

not getting invoked? Apologies for the late reply; the intention was that the isEmpty() check prevents this condition.
It actually doesn't, because there is an empty string in the list.
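The distinction is easy to miss; with plain JDK lists for illustration (the driver's default uses Guava's ImmutableList, but the behavior is the same):

```java
import java.util.List;

// A list containing one empty string is NOT an empty list, so an
// isEmpty() guard does not catch a default like ImmutableList.of("").
public class EmptyDefaultDemo {
  public static void main(String[] args) {
    System.out.println(List.of().isEmpty());   // true
    System.out.println(List.of("").isEmpty()); // false
  }
}
```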
Ability to specify ordering of remote dc's via new configuration for deterministic failovers.
https://datastax-oss.atlassian.net/browse/JAVA-3142
Motivation
The driver uses the remote contact points (in no particular order) when a query plan is built if max-nodes-per-remote-dc is used. Explicitly specifying the remote dc's in order will help achieve more deterministic failovers in case of outages/degradation experienced with the ring associated with basic.load-balancing-policy.local-datacenter. There could also be cases where more than one ring is present in a DC; in case of any issues with the ring associated with basic.load-balancing-policy.local-datacenter (localDc), the driver can fail over to the other rings present in the same DC via this new feature. In the case of one ring per DC, applications can choose remote dc's deterministically when the query plan is built.
Modifications
Added a new configuration, advanced.load-balancing-policy.dc-failover.preferred-remote-dcs, which optionally takes a list of preferred remote dc's (in order) used in building the query plan. Backward compatibility is maintained if the config is not present.
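As a sketch, a reference.conf fragment using this option might look like the following (DC names are illustrative; max-nodes-per-remote-dc is shown only for context):

```
datastax-java-driver {
  basic.load-balancing-policy {
    local-datacenter = dc1
  }
  advanced.load-balancing-policy.dc-failover {
    # Try up to 2 nodes from each remote DC after the local DC.
    max-nodes-per-remote-dc = 2
    # Remote DCs to prefer, in exactly this order, when failing over.
    preferred-remote-dcs = [ dc2, dc3 ]
  }
}
```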
Result
If this configuration is set, the order specified in the config will be used when building the query plan.