Commit e7be753

sbernauer and adwk67 authored
Add concepts guide on graceful shutdown (#468)
* Add concepts guide on graceful shutdown
* Apply suggestions from code review
  Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
* Add k8s requirements
* Apply suggestions from code review
  Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
* fix nav
---------
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
1 parent f46a91b commit e7be753

3 files changed: +38 -3 lines changed

modules/concepts/nav.adoc

Lines changed: 2 additions & 2 deletions
@@ -10,10 +10,10 @@
 ** xref:resources.adoc[]
 ** xref:s3.adoc[]
 ** xref:tls_server_verification.adoc[]
-** xref:pod_placement.adoc[]
 ** xref:overrides.adoc[]
 ** xref:duration.adoc[]
 ** xref:operations/index.adoc[]
 *** xref:operations/cluster_operations.adoc[]
-*** xref:operations/pod_placement.adoc[]
 *** xref:operations/pod_disruptions.adoc[]
+*** xref:operations/pod_placement.adoc[]
+*** xref:operations/graceful_shutdown.adoc[]
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+= Graceful shutdown
+
+The article https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-best-practices-terminating-with-grace[Kubernetes best practices: terminating with grace] describes how a graceful shutdown works in Kubernetes.
+
+Our operators add the needed shutdown mechanism for their products that support graceful shutdown.
+
+They also configure a sensible amount of time Pods are granted to properly shut down without disrupting the availability of the product.
+If you are not satisfied with the default values, you can set the graceful shutdown timeout as follows:
+
+[source,yaml]
+----
+spec:
+  workers:
+    config:
+      gracefulShutdownTimeout: 1h # Set it for all worker roleGroups
+    roleGroups:
+      normal: # Will use 1h from the worker role config
+        replicas: 1
+      long: # Will use 6h from the roleGroup config below
+        replicas: 1
+        config:
+          gracefulShutdownTimeout: 6h # Set it only for this specific roleGroup
+----
+
+The individual default timeouts are documented in the specific operators at the `Operations -> Graceful shutdown` usage-guide.
+
+== Kubernetes cluster requirements
+Pods need to have the ability to take as long as they need to gracefully shut down without getting killed.
+
+Imagine the situation that you set the graceful shutdown period to 24 hours.
+In the case of e.g. an on-premise Kubernetes cluster the Kubernetes infrastructure team may want to drain the Kubernetes node so that they can do regular maintenance, such as rebooting the node.
+They will have some upper limit on how long they will wait for Pods on the Node to terminate before they reboot the Kubernetes node, regardless of any Pods that are still running.
+
+When setting up a production cluster, you need to check with your Kubernetes administrator (or cloud provider) what time period your Pods have to terminate gracefully.
+It is not sufficient to have a look at the `spec.terminationGracePeriodSeconds` and come to the conclusion that the Pods have e.g. 24 hours to gracefully shut down, as e.g. an administrator can reboot the Kubernetes node before the time period is reached.
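
Reviewer note: the override rule in the YAML sample (a roleGroup value wins over the role value) and the warning about node drains can be sketched in a few lines of Python. This is only an illustration, not the operators' implementation. The helper names `parse_duration`, `effective_timeout` and `exceeds_drain_limit` are hypothetical, the parser handles only the `d`, `h`, `m` and `s` units, and the `2h` drain limit is an invented example of what an infrastructure team might allow.

```python
import re
from typing import Optional

# Seconds per unit; only d/h/m/s are handled in this sketch (an assumption).
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
KEY = "gracefulShutdownTimeout"


def parse_duration(text: str) -> int:
    """Convert a duration such as '1h' or '1h30m' into whole seconds."""
    if not re.fullmatch(r"(\d+[dhms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", text))


def effective_timeout(role_config: dict, role_group_config: dict) -> Optional[str]:
    """Return the roleGroup value if set, else the role value, else None."""
    return role_group_config.get(KEY, role_config.get(KEY))


def exceeds_drain_limit(timeout: str, drain_limit: str) -> bool:
    """True if a Pod could still be shutting down when the node is rebooted."""
    return parse_duration(timeout) > parse_duration(drain_limit)


# Mirrors the YAML sample: 1h on the role, 6h on the "long" roleGroup.
role = {KEY: "1h"}
role_groups = {"normal": {}, "long": {KEY: "6h"}}
for name, config in role_groups.items():
    timeout = effective_timeout(role, config)
    # "2h" is a made-up drain limit for illustration only.
    print(name, timeout, exceeds_drain_limit(timeout, "2h"))
```

With these inputs the loop prints `normal 1h False` and `long 6h True`: the `long` roleGroup asks for more time than the hypothetical drain window allows, which is the situation the "Kubernetes cluster requirements" section warns about.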

modules/concepts/pages/operations/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ Make sure to go through the following checklist to achieve the maximum level of
 Many HA capable products offer a way to gracefully shut down the service running within the Pod.
 The flow is as follows: Kubernetes wants to shut down the Pod and calls a hook into the Pod, which in turn interacts with the product, signaling it to gracefully shut down.
 The final deletion of the Pod is then blocked until the product has successfully migrated running workloads away from the Pod that is to be shut down.
-Details covering the graceful shutdown mechanism are described in the actual operator documentation.
+Details covering the graceful shutdown mechanism are described in xref:operations/graceful_shutdown.adoc[] as well as the actual operator documentation.
 +
 WARNING: Graceful shutdown is not implemented for all products yet. Please check the documentation specific to the product operator to see if it is supported (such as e.g. xref:trino:usage-guide/operations/graceful-shutdown.adoc[the documentation for Trino].
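
Reviewer note: the flow described in the context lines (ask the product to shut down, then block deletion until it has handed off its work) can be modelled with a small, purely illustrative simulation. Nothing here talks to Kubernetes. `FakeProduct` and `terminate` are hypothetical names, and time is simulated in whole seconds rather than measured. The one assumption carried over from Kubernetes itself is that a Pod still running when its grace period ends is killed.

```python
from dataclasses import dataclass


@dataclass
class FakeProduct:
    """Stand-in for a product that needs `drain_seconds` to hand off its work."""
    drain_seconds: int
    draining: bool = False
    elapsed: int = 0

    def begin_graceful_shutdown(self) -> None:
        # Corresponds to the hook signaling the product to shut down.
        self.draining = True

    def tick(self, seconds: int) -> None:
        # Simulated time only advances the drain once shutdown has begun.
        if self.draining:
            self.elapsed += seconds

    @property
    def drained(self) -> bool:
        return self.draining and self.elapsed >= self.drain_seconds


def terminate(product: FakeProduct, grace_period_seconds: int, step: int = 1) -> str:
    """Ask the product to stop, wait up to the grace period, then give up."""
    product.begin_graceful_shutdown()
    waited = 0
    while waited < grace_period_seconds:
        if product.drained:
            return "graceful"
        product.tick(step)
        waited += step
    # Deadline reached: the Pod is removed whether or not draining finished.
    return "graceful" if product.drained else "killed"


print(terminate(FakeProduct(drain_seconds=5), grace_period_seconds=10))
print(terminate(FakeProduct(drain_seconds=20), grace_period_seconds=10))
```

The first call finishes draining inside the grace period and the second does not, which is why the configured timeout has to cover the product's realistic drain time.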
