
Commit 6c8e92e

concepts: Document caveats with HPAs and PDBs (#461)
* concepts: Document the use of HPAs
* concepts: Document caveats with HPAs and PDBs
* Add PDB guidelines
* typo
* Update modules/concepts/pages/operations/pod_disruptions.adoc
  Co-authored-by: Malte Sander <malte.sander.it@gmail.com>
* Update modules/concepts/pages/operations/pod_disruptions.adoc
  Co-authored-by: Malte Sander <malte.sander.it@gmail.com>
* Update modules/concepts/pages/operations/pod_disruptions.adoc
  Co-authored-by: Malte Sander <malte.sander.it@gmail.com>
* Address review feedback
* Apply suggestions from code review
  Co-authored-by: Malte Sander <malte.sander.it@gmail.com>
* fix links
* newlines
* trigger ci

---------

Co-authored-by: Malte Sander <malte.sander.it@gmail.com>
1 parent 6747b17 commit 6c8e92e

File tree

3 files changed: +53 −27 lines changed


modules/concepts/pages/opa.adoc

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ The automatic connection is facilitated by the xref:service_discovery.adoc[servi

Read more about the xref:opa:index.adoc[]. Read more about product integration with OPA for these products:

-* xref:trino:usage_guide/security.adoc#_authorization[Trino]
+* xref:trino:usage-guide/security.adoc#_authorization[Trino]
* xref:kafka:usage.adoc[Kafka]
* xref:druid:usage-guide/security.adoc#authorization[Druid]

modules/concepts/pages/operations/index.adoc

Lines changed: 30 additions & 26 deletions
@@ -7,18 +7,19 @@ It provides you with the necessary details to operate it in a production environ

Make sure to go through the following checklist to achieve the maximum level of availability for your services.

-1. Make setup highly available (HA): In case the product supports running in an HA fashion, our operators will automatically
-configure it for you. You only need to make sure that you deploy a sufficient number of replicas. Please note that
-some products don't support HA.
-2. Reduce the number of simultaneous pod disruptions (unavailable replicas). The Stackable operators write defaults
-based upon knowledge about the fault tolerance of the product, which should cover most of the use-cases. For details
-have a look at xref:operations/pod_disruptions.adoc[].
-3. Reduce impact of pod disruption: Many HA capable products offer a way to gracefully shut down the service running
-within the Pod. The flow is as follows: Kubernetes wants to shut down the Pod and calls a hook into the Pod, which in turn
-interacts with the product, telling it to gracefully shut down. The final deletion of the Pod is then blocked until
-the product has successfully migrated running workloads away from the Pod that is to be shut down. Details covering the graceful shutdown mechanism are described in the actual operator documentation.
+1. Make setup highly available (HA): In case the product supports running in an HA fashion, our operators will automatically configure it for you.
+You only need to make sure that you deploy a sufficient number of replicas.
+Please note that some products don't support HA.
+2. Reduce the number of simultaneous pod disruptions (unavailable replicas).
+The Stackable operators write defaults based upon knowledge about the fault tolerance of the product, which should cover most of the use-cases.
+For details have a look at xref:operations/pod_disruptions.adoc[].
+3. Reduce impact of pod disruptions:
+Many HA capable products offer a way to gracefully shut down the service running within the Pod.
+The flow is as follows: Kubernetes wants to shut down the Pod and calls a hook into the Pod, which in turn interacts with the product, signaling it to gracefully shut down.
+The final deletion of the Pod is then blocked until the product has successfully migrated running workloads away from the Pod that is to be shut down.
+Details covering the graceful shutdown mechanism are described in the actual operator documentation.
+

-WARNING: Graceful shutdown is not implemented for all products yet. Please check the documentation specific to the product operator to see if it is supported (such as e.g. xref:trino:usage_guide/operations/graceful-shutdown.adoc[the documentation for Trino].
+WARNING: Graceful shutdown is not implemented for all products yet. Please check the documentation specific to the product operator to see if it is supported (such as e.g. xref:trino:usage-guide/operations/graceful-shutdown.adoc[the documentation for Trino].

4. Spread workload across multiple Kubernetes nodes, racks, datacenter rooms or datacenters to guarantee availability
in the case of e.g. power outages or fire in parts of the datacenter. All of this is supported by
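To make the graceful-shutdown flow in item 3 concrete: the mechanism described there corresponds to a Kubernetes `preStop` lifecycle hook combined with a termination grace period. The following is only an illustrative sketch; the container name, image and drain command are hypothetical and not the Stackable operators' actual implementation.

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: product-worker-0                 # hypothetical Pod; in practice created via a StatefulSet
spec:
  terminationGracePeriodSeconds: 300     # Pod deletion is blocked for at most this long
  containers:
    - name: product                      # hypothetical container name
      image: example.org/product:1.0     # placeholder image
      lifecycle:
        preStop:
          exec:
            # hypothetical command that tells the product to drain and migrate its workloads
            command: ["/bin/sh", "-c", "product-cli drain --wait"]
----

Kubernetes runs the `preStop` hook before sending SIGTERM and only force-kills the container once `terminationGracePeriodSeconds` has expired.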
@@ -27,23 +28,26 @@ WARNING: Graceful shutdown is not implemented for all products yet. Please check

== Maintenance actions

-Sometimes you want to quickly shut down a product or update the Stackable operators without all the managed products
-restarting at the same time. You can achieve this using the following methods:
+Sometimes you want to quickly shut down a product or update the Stackable operators without all the managed products restarting at the same time.
+You can achieve this using the following methods:

1. Quickly stop and start a whole product using `stopped` as described in xref:operations/cluster_operations.adoc[].
-2. Prevent any changes to your deployed product using `reconcilePaused` as described in xref:operations/cluster_operations.adoc[].
+2. Prevent any changes to your deployed product using `reconciliationPaused` as described in xref:operations/cluster_operations.adoc[].
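As a sketch of how these two switches are set, assuming the `clusterOperation` field documented in xref:operations/cluster_operations.adoc[]; the Trino resource and its name are only examples:

[source,yaml]
----
apiVersion: trino.stackable.tech/v1alpha1   # any Stackable product resource; Trino is just an example
kind: TrinoCluster
metadata:
  name: simple-trino                        # illustrative name
spec:
  clusterOperation:
    stopped: true               # scale the product down to zero Pods while keeping its definition
    reconciliationPaused: false # set to true to freeze the current state, e.g. during operator updates
  # ... remainder of the cluster spec unchanged ...
----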

== Performance

-1. You can configure the available resource every product has using xref:concepts:resources.adoc[]. The defaults are
-very restrained, as you should be able to spin up multiple products running on your Laptop.
-2. You can not only use xref:operations/pod_placement.adoc[] to achieve more resilience, but also to co-locate products
-that communicate frequently with each other. One example is placing HBase regionservers on the same Kubernetes node
-as the HDFS datanodes. Our operators already take this into account and co-locate connected services. However, if
-you are not satisfied with the automatically created affinities you can use ref:operations/pod_placement.adoc[] to
-configure your own.
-3. If you want to have certain services running on dedicated nodes you can also use xref:operations/pod_placement.adoc[]
-to force the Pods to be scheduled on certain nodes. This is especially helpful if you e.g. have Kubernetes nodes with
-16 cores and 64 GB, as you could allocate nearly 100% of these node resources to your Spark executors or Trino workers.
-In this case it is important that you https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/[taint]
-your Kubernetes nodes and use xref:overrides.adoc#pod-overrides[podOverrides] to add a `toleration` for the taint.
+1. *Compute resources*: You can configure the available resource every product has using xref:concepts:resources.adoc[].
+The defaults are very restrained, as you should be able to spin up multiple products running on your Laptop.
+2. *Autoscaling*: Although not supported by the platform natively yet, you can use
+https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale[HorizontalPodAutoscaler] to autoscale the number of Pods running for a given rolegroup dynamically based upon resource usage.
+To achieve this you need to omit the number of replicas on the rolegroup to be scaled, which in turn results in the created StatefulSet not having any replicas set as well.
+Afterwards you can deploy a HorizontalPodAutoscaler as usual.
+Please note that not all product-operators have implemented graceful shutdown, so the product might be disturbed during scale down.
+Later platform versions will support autoscaling natively with sensible defaults and will deploy HorizontalPodAutoscaler objects for you.
+3. *Co-location*: You can not only use xref:operations/pod_placement.adoc[] to achieve more resilience, but also to co-locate products that communicate frequently with each other.
+One example is placing HBase regionservers on the same Kubernetes node as the HDFS datanodes.
+Our operators take this into account and co-locate connected services by default.
+If you are not satisfied with the automatically created affinities you can use xref:operations/pod_placement.adoc[] to configure your own.
+4. *Dedicated nodes*: If you want to have certain services running on dedicated nodes you can also use xref:operations/pod_placement.adoc[] to force the Pods to be scheduled on certain nodes.
+This is especially helpful if you e.g. have Kubernetes nodes with 16 cores and 64 GB, as you could allocate nearly 100% of these node resources to your Spark executors or Trino workers.
+In this case it is important that you https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/[taint] your Kubernetes nodes and use xref:overrides.adoc#pod-overrides[podOverrides] to add a `toleration` for the taint.
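To illustrate the autoscaling guidance in item 2 above: once the rolegroup omits `replicas`, a standard HorizontalPodAutoscaler can target the StatefulSet the operator creates. A minimal sketch, assuming the `autoscaling/v2` API; the StatefulSet name `simple-trino-worker-default` and the thresholds are illustrative examples, not platform-defined values:

[source,yaml]
----
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: trino-worker-default              # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: simple-trino-worker-default     # the StatefulSet created for the rolegroup (example name)
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80          # example threshold
----

Keep the caveats from the text in mind: without graceful shutdown, scale-down can disturb the product, and the PDB defaults described in pod_disruptions.adoc assume one Pod per rolegroup when `replicas` is omitted.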
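For item 4 (dedicated nodes), the taint-plus-toleration combination could look roughly as follows, assuming the node was tainted with something like `kubectl taint nodes <node> dedicated=trino-workers:NoSchedule`; the taint key/value and the role and rolegroup names are illustrative and not prescribed by the platform:

[source,yaml]
----
# illustrative fragment of a product cluster spec
spec:
  workers:                       # hypothetical role name
    roleGroups:
      default:
        podOverrides:            # see xref:overrides.adoc#pod-overrides[podOverrides]
          spec:
            tolerations:
              - key: dedicated
                operator: Equal
                value: trino-workers
                effect: NoSchedule
----

A toleration only permits scheduling onto the tainted nodes; to actually pin the Pods there, combine it with the node selection options from xref:operations/pod_placement.adoc[].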

modules/concepts/pages/operations/pod_disruptions.adoc

Lines changed: 22 additions & 0 deletions
@@ -15,6 +15,28 @@ The defaults depend on the individual product and can be found below the "Operat

They are based on our knowledge of each product's fault tolerance.
In some cases they may be a little pessimistic, but they can be adjusted as documented in the following sections.

+In general, product roles are split into the following two categories, which serve as guidelines for the default values we apply:
+
+=== Multiple replicas to increase availability
+
+For these roles (e.g. ZooKeeper servers, HDFS journal + namenodes or HBase masters), only a single Pod is allowed to be unavailable.
+For example, imagine a cluster with 7 ZooKeeper servers, where 4 servers are required to form a quorum and healthy ensemble.
+By allowing 2 servers to be unavailable, there is no single point of failure (as there are at least 5 servers available).
+But there is only a single spare server left. The reason to choose 7 instead of e.g. 5 ZooKeeper servers might be, that there are always at least 2 spare servers.
+Increasing the number of allowed disruptions and increasing the number of replicas is not improving the general availability.
+
+=== Multiple replicas to increase performance
+
+For these roles (e.g. HDFS datanodes, HBase regionservers or Trino workers), more than a single Pod is allowed to be unavailable. Otherwise, rolling re-deployments may take very long.
+
+IMPORTANT: The operators calculate the number of Pods for a given role by adding the number of replicas of every rolegroup that is part of that role.
+
+In case there are no replicas defined on a rolegroup, one Pod will be assumed for this rolegroup, as the created Kubernetes objects (StatefulSets or Deployments) will default to a single replica as well.
+However, in case there are https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/[HorizontalPodAutoscaler] in place, the number of replicas of a rolegroup can change dynamically.
+In this case the operators might falsely assume that rolegroups have fewer Pods than they actually have.
+This is a pessimistic approach, as the number of allowed disruptions normally stays the same or even increases when the number of Pods increases.
+This should be safe, but in some cases more Pods *could* have been allowed to be unavailable which may increase the duration of rolling re-deployments.
+

== Influencing and disabling PDBs

You can configure
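For context on the objects these defaults produce, below is a minimal sketch of a PodDisruptionBudget of the kind described above. The name, label selector and `maxUnavailable` value are illustrative only and do not reflect the operators' exact output:

[source,yaml]
----
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: simple-hdfs-datanode               # illustrative name
spec:
  maxUnavailable: 2                         # example value for a performance-oriented role
  selector:
    matchLabels:                            # illustrative labels selecting all Pods of the role
      app.kubernetes.io/name: hdfs
      app.kubernetes.io/instance: simple-hdfs
      app.kubernetes.io/component: datanode
----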
