Commit dad96db

sbernauer, Jimvin, and siegfriedweber authored
feat: Support graceful shutdown (#407)
* feat: Support graceful shutdown
* update docs
* docs
* changelog
* link code in docs
* increase default of datanodes to 30 min
* move into constants
* use new operator-rs
* docs: Format 15 minutes
* Use new operator-rs
* improve docs
* fix link
* use operator-rs 0.55.0
* fixup
* improve docs
* set error context
* Added a high level description of graceful shutdown
* Revert "Added a high level description of graceful shutdown"

  This reverts commit 7733ec1. Moved to stackabletech/documentation#473
* Move rustdoc above field attributes
* Avoid snafu context(false)
* docs wording
* newline
* fix: Vector graceful shutdown
* downgrade ring again
* fix links
* use new operator-rs
* chore: Bump operator-rs to 0.56.0
* Revert "chore: Bump operator-rs to 0.56.0"

  This reverts commit 4e14d57.
* Update docs/modules/hdfs/pages/usage-guide/operations/graceful-shutdown.adoc

  Co-authored-by: Siegfried Weber <mail@siegfriedweber.net>

---------

Co-authored-by: Jim Halfpenny <jim@source321.com>
Co-authored-by: Siegfried Weber <mail@siegfriedweber.net>
1 parent e459079 commit dad96db

File tree

10 files changed, +160 -9 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,7 @@ All notable changes to this project will be documented in this file.
 - Default stackableVersion to operator version ([#381]).
 - Configuration overrides for the JVM security properties, such as DNS caching ([#384]).
 - Support PodDisruptionBudgets ([#394]).
+- Support graceful shutdown ([#407]).
 - Added support for 3.2.4, 3.3.6 ([#409]).
 
 ### Changed
@@ -33,6 +34,7 @@ All notable changes to this project will be documented in this file.
 [#402]: https://github.com/stackabletech/hdfs-operator/pull/402
 [#404]: https://github.com/stackabletech/hdfs-operator/pull/404
 [#405]: https://github.com/stackabletech/hdfs-operator/pull/405
+[#407]: https://github.com/stackabletech/hdfs-operator/pull/407
 [#409]: https://github.com/stackabletech/hdfs-operator/pull/409
 
 ## [23.7.0] - 2023-07-14

deploy/helm/hdfs-operator/crds/crds.yaml

Lines changed: 24 additions & 0 deletions
@@ -576,6 +576,10 @@ spec:
 type: array
 type: object
 type: object
+gracefulShutdownTimeout:
+description: Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+nullable: true
+type: string
 logging:
 default:
 enableVectorAgent: null
@@ -4069,6 +4073,10 @@ spec:
 type: array
 type: object
 type: object
+gracefulShutdownTimeout:
+description: Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+nullable: true
+type: string
 logging:
 default:
 enableVectorAgent: null
@@ -7621,6 +7629,10 @@ spec:
 type: array
 type: object
 type: object
+gracefulShutdownTimeout:
+description: Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+nullable: true
+type: string
 logging:
 default:
 enableVectorAgent: null
@@ -11105,6 +11117,10 @@ spec:
 type: array
 type: object
 type: object
+gracefulShutdownTimeout:
+description: Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+nullable: true
+type: string
 logging:
 default:
 enableVectorAgent: null
@@ -14606,6 +14622,10 @@ spec:
 type: array
 type: object
 type: object
+gracefulShutdownTimeout:
+description: Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+nullable: true
+type: string
 logging:
 default:
 enableVectorAgent: null
@@ -18090,6 +18110,10 @@ spec:
 type: array
 type: object
 type: object
+gracefulShutdownTimeout:
+description: Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+nullable: true
+type: string
 logging:
 default:
 enableVectorAgent: null
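
Reviewer note: the CRD field above takes values such as `30m`, `1h` or `2d`. As a rough illustration of that format, the sketch below parses a single number followed by one unit. The function name `parse_timeout` and the set of accepted units (`s`, `m`, `h`, `d`) are assumptions made for this example; this is not the parser used by the operator, which may accept more forms.

```rust
use std::time::Duration;

/// Parses a single `<number><unit>` value such as `30m`, `1h` or `2d`.
/// Hypothetical helper: only the units s, m, h and d are handled here.
fn parse_timeout(input: &str) -> Option<Duration> {
    let input = input.trim();
    // Find where the digits end and the unit begins.
    let split_at = input.find(|c: char| !c.is_ascii_digit())?;
    let (number, unit) = input.split_at(split_at);
    // An empty number (e.g. the input "m") fails to parse and yields None.
    let value: u64 = number.parse().ok()?;
    let seconds_per_unit: u64 = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 60 * 60,
        "d" => 24 * 60 * 60,
        _ => return None,
    };
    // Reject values whose second count would overflow a u64.
    value.checked_mul(seconds_per_unit).map(Duration::from_secs)
}
```

A value without a unit, or with an unknown unit, is rejected rather than guessed at, which keeps a typo from silently turning into a very short or very long timeout.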

docs/modules/hdfs/pages/usage-guide/operations/graceful-shutdown.adoc

Lines changed: 34 additions & 3 deletions
@@ -1,6 +1,37 @@
 = Graceful shutdown
 
-Graceful shutdown of HDFS nodes is either not supported by the product itself
-or we have not implemented it yet.
+You can configure the graceful shutdown as described in xref:concepts:operations/graceful_shutdown.adoc[].
 
-Outstanding implementation work for the graceful shutdowns of all products where this functionality is relevant is tracked in https://github.com/stackabletech/issues/issues/357
+== JournalNodes
+
+As a default, JournalNodes have `15 minutes` to shut down gracefully.
+
+The JournalNode process will always run as PID `1` and will receive a `SIGTERM` signal when Kubernetes wants to terminate the Pod.
+It will log the received signal as shown in the log below and initiate a graceful shutdown.
+After the graceful shutdown timeout runs out, and the process still didn't exit, Kubernetes will issue a `SIGKILL` signal.
+
+https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java#L272[This] is the relevant code that gets executed in the JournalNodes as of HDFS version `3.3.4`.
+
+
+[source,text]
+----
+2023-10-10 13:37:41,525 ERROR server.JournalNode (LogAdapter.java:error(75)) - RECEIVED SIGNAL 15: SIGTERM
+2023-10-10 13:37:41,526 INFO server.JournalNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
+/************************************************************
+SHUTDOWN_MSG: Shutting down JournalNode at hdfs-journalnode-default-0/10.244.0.38
+************************************************************/
+----
+
+== NameNodes
+
+As a default, NameNodes have `15 minutes` to shut down gracefully.
+They use the same mechanism described above.
+
+https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java#L1080[This] is the relevant code that gets executed in the NameNodes as of HDFS version `3.3.4`.
+
+== DataNodes
+
+As a default, DataNodes have `30 minutes` to shut down gracefully.
+They use the same mechanism described above.
+
+https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L2004[This] is the relevant code that gets executed in the DataNodes as of HDFS version `3.3.4`.

rust/crd/src/constants.rs

Lines changed: 9 additions & 0 deletions
@@ -1,3 +1,5 @@
+use stackable_operator::time::Duration;
+
 pub const DEFAULT_DFS_REPLICATION_FACTOR: u8 = 3;
 
 pub const CONTROLLER_NAME: &str = "hdfsclusters.hdfs.stackable.tech";
@@ -41,6 +43,13 @@ pub const DEFAULT_JOURNAL_NODE_HTTP_PORT: u16 = 8480;
 pub const DEFAULT_JOURNAL_NODE_HTTPS_PORT: u16 = 8481;
 pub const DEFAULT_JOURNAL_NODE_RPC_PORT: u16 = 8485;
 
+pub const DEFAULT_JOURNAL_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration =
+    Duration::from_minutes_unchecked(15);
+pub const DEFAULT_NAME_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration =
+    Duration::from_minutes_unchecked(15);
+pub const DEFAULT_DATA_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration =
+    Duration::from_minutes_unchecked(30);
+
 // hdfs-site.xml
 pub const DFS_NAMENODE_NAME_DIR: &str = "dfs.namenode.name.dir";
 pub const DFS_NAMENODE_SHARED_EDITS_DIR: &str = "dfs.namenode.shared.edits.dir";
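
Reviewer note: the defaults above are given in minutes, while the kuttl assertions later in this commit expect `terminationGracePeriodSeconds` values of 900 and 1800. The sketch below checks that arithmetic. It uses `std::time::Duration` as a stand-in for `stackable_operator::time::Duration`, and the helper `as_grace_period_seconds` is a hypothetical name introduced only for this example.

```rust
use std::time::Duration;

// Stand-ins for the constants in the diff, built on std's Duration
// (an assumption for this sketch; the commit uses stackable_operator's type).
const DEFAULT_JOURNAL_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration = Duration::from_secs(15 * 60);
const DEFAULT_NAME_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration = Duration::from_secs(15 * 60);
const DEFAULT_DATA_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration = Duration::from_secs(30 * 60);

/// Returns the timeout in whole seconds, the unit of `terminationGracePeriodSeconds`.
/// Hypothetical helper for illustration.
fn as_grace_period_seconds(timeout: Duration) -> u64 {
    timeout.as_secs()
}
```

With these values, JournalNodes and NameNodes map to 900 seconds and DataNodes to 1800 seconds.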

rust/crd/src/lib.rs

Lines changed: 26 additions & 0 deletions
@@ -30,6 +30,7 @@ use stackable_operator::{
     role_utils::{GenericRoleConfig, Role, RoleGroup, RoleGroupRef},
     schemars::{self, JsonSchema},
     status::condition::{ClusterCondition, HasStatusCondition},
+    time::Duration,
 };
 use strum::{Display, EnumIter, EnumString};
 
@@ -160,6 +161,7 @@ pub trait MergedConfig {
         None
     }
     fn affinity(&self) -> &StackableAffinity;
+    fn graceful_shutdown_timeout(&self) -> Option<&Duration>;
     /// Main container shared by all roles
     fn hdfs_logging(&self) -> ContainerLogConfig;
     /// Vector container shared by all roles
@@ -845,6 +847,9 @@ pub struct NameNodeConfig {
     pub logging: Logging<NameNodeContainer>,
     #[fragment_attrs(serde(default))]
     pub affinity: StackableAffinity,
+    /// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+    #[fragment_attrs(serde(default))]
+    pub graceful_shutdown_timeout: Option<Duration>,
 }
 
 impl MergedConfig for NameNodeConfig {
@@ -856,6 +861,10 @@ impl MergedConfig for NameNodeConfig {
         &self.affinity
     }
 
+    fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
+        self.graceful_shutdown_timeout.as_ref()
+    }
+
     fn hdfs_logging(&self) -> ContainerLogConfig {
         self.logging
             .containers
@@ -920,6 +929,7 @@ impl NameNodeConfigFragment {
             },
             logging: product_logging::spec::default_logging(),
             affinity: get_affinity(cluster_name, role),
+            graceful_shutdown_timeout: Some(DEFAULT_NAME_NODE_GRACEFUL_SHUTDOWN_TIMEOUT),
         }
     }
 }
@@ -1005,6 +1015,9 @@ pub struct DataNodeConfig {
     pub logging: Logging<DataNodeContainer>,
     #[fragment_attrs(serde(default))]
     pub affinity: StackableAffinity,
+    /// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+    #[fragment_attrs(serde(default))]
+    pub graceful_shutdown_timeout: Option<Duration>,
 }
 
 impl MergedConfig for DataNodeConfig {
@@ -1018,6 +1031,10 @@ impl MergedConfig for DataNodeConfig {
         &self.affinity
     }
 
+    fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
+        self.graceful_shutdown_timeout.as_ref()
+    }
+
     fn hdfs_logging(&self) -> ContainerLogConfig {
         self.logging
             .containers
@@ -1073,6 +1090,7 @@ impl DataNodeConfigFragment {
             },
             logging: product_logging::spec::default_logging(),
             affinity: get_affinity(cluster_name, role),
+            graceful_shutdown_timeout: Some(DEFAULT_DATA_NODE_GRACEFUL_SHUTDOWN_TIMEOUT),
         }
     }
 }
@@ -1156,6 +1174,9 @@ pub struct JournalNodeConfig {
     pub logging: Logging<JournalNodeContainer>,
     #[fragment_attrs(serde(default))]
     pub affinity: StackableAffinity,
+    /// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
+    #[fragment_attrs(serde(default))]
+    pub graceful_shutdown_timeout: Option<Duration>,
 }
 
 impl MergedConfig for JournalNodeConfig {
@@ -1167,6 +1188,10 @@ impl MergedConfig for JournalNodeConfig {
         &self.affinity
     }
 
+    fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
+        self.graceful_shutdown_timeout.as_ref()
+    }
+
     fn hdfs_logging(&self) -> ContainerLogConfig {
         self.logging
             .containers
@@ -1210,6 +1235,7 @@ impl JournalNodeConfigFragment {
             },
             logging: product_logging::spec::default_logging(),
             affinity: get_affinity(cluster_name, role),
+            graceful_shutdown_timeout: Some(DEFAULT_JOURNAL_NODE_GRACEFUL_SHUTDOWN_TIMEOUT),
         }
     }
 }
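
Reviewer note: each role's default fragment above sets `graceful_shutdown_timeout` to `Some(...)`, so a merged config should always end up with a value. The sketch below shows one way such a merge could resolve the field. The `ConfigFragment` struct, the `merged_timeout` function, and the precedence order (role group over role over default) are assumptions for illustration and are not taken from this commit or from operator-rs.

```rust
use std::time::Duration;

/// Hypothetical, simplified stand-in for a config fragment: the field is optional.
#[derive(Clone, Copy, Default)]
struct ConfigFragment {
    graceful_shutdown_timeout: Option<Duration>,
}

/// Resolves the effective timeout under the assumed precedence:
/// role group value first, then role value, then the built-in default.
fn merged_timeout(
    role_group: ConfigFragment,
    role: ConfigFragment,
    default: Duration,
) -> Duration {
    role_group
        .graceful_shutdown_timeout
        .or(role.graceful_shutdown_timeout)
        .unwrap_or(default)
}
```

Because the default is always present, the result is never absent, which matches the comment in the new `graceful_shutdown` module that users cannot disable graceful shutdown.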

rust/operator/src/container.rs

Lines changed: 26 additions & 5 deletions
@@ -28,7 +28,13 @@ use stackable_hdfs_crd::{
     storage::DataNodeStorageConfig,
     DataNodeContainer, HdfsCluster, HdfsPodRef, HdfsRole, MergedConfig, NameNodeContainer,
 };
-use stackable_operator::builder::SecretFormat;
+use stackable_operator::{
+    builder::SecretFormat,
+    product_logging::framework::{
+        create_vector_shutdown_file_command, remove_vector_shutdown_file_command,
+    },
+    utils::COMMON_BASH_TRAP_FUNCTIONS,
+};
 use stackable_operator::{
     builder::{
         resources::ResourceRequirementsBuilder, ContainerBuilder, PodBuilder,
@@ -45,9 +51,12 @@ use stackable_operator::{
     },
     kube::ResourceExt,
     memory::{BinaryMultiple, MemoryQuantity},
-    product_logging,
-    product_logging::spec::{
-        ConfigMapLogConfig, ContainerLogConfig, ContainerLogConfigChoice, CustomContainerLogConfig,
+    product_logging::{
+        self,
+        spec::{
+            ConfigMapLogConfig, ContainerLogConfig, ContainerLogConfigChoice,
+            CustomContainerLogConfig,
+        },
     },
 };
 use std::{collections::BTreeMap, str::FromStr};
@@ -437,9 +446,21 @@ impl ContainerConfig {
                     HDFS_LOG4J_CONFIG_FILE,
                     merged_config.hdfs_logging(),
                 ));
+
                 args.push_str(&format!(
-                    "{hadoop_home}/bin/hdfs {role}\n",
+                    "\
+{COMMON_BASH_TRAP_FUNCTIONS}
+{remove_vector_shutdown_file_command}
+prepare_signal_handlers
+{hadoop_home}/bin/hdfs {role} &
+wait_for_termination $!
+{create_vector_shutdown_file_command}
+",
                     hadoop_home = Self::HADOOP_HOME,
+                    remove_vector_shutdown_file_command =
+                        remove_vector_shutdown_file_command(STACKABLE_LOG_DIR),
+                    create_vector_shutdown_file_command =
+                        create_vector_shutdown_file_command(STACKABLE_LOG_DIR),
                 ));
             }
             ContainerConfig::Zkfc { .. } => {
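
Reviewer note: the new launch command starts the HDFS role in the background, waits for it, and only then writes the Vector shutdown file. The sketch below reproduces the string assembly with plain arguments so the resulting script is easy to inspect. The function `launch_script` and its parameters are hypothetical; the real trap functions and Vector commands come from `stackable_operator` and their contents are not shown in this commit.

```rust
/// Builds the shell snippet that starts a role in the background and waits for it.
/// All arguments are placeholders supplied by the caller (assumption for this sketch).
fn launch_script(
    trap_functions: &str,
    remove_cmd: &str,
    create_cmd: &str,
    hadoop_home: &str,
    role: &str,
) -> String {
    // The `\` directly after the opening quote drops the first newline, so the
    // script starts with the trap functions rather than with an empty line.
    format!(
        "\
{trap_functions}
{remove_cmd}
prepare_signal_handlers
{hadoop_home}/bin/hdfs {role} &
wait_for_termination $!
{create_cmd}
"
    )
}
```

Calling it with, say, `"/stackable/hadoop"` and `"namenode"` (example values only) yields a script in which `$!` refers to the PID of the backgrounded `hdfs` process.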

rust/operator/src/hdfs_controller.rs

Lines changed: 9 additions & 1 deletion
@@ -54,7 +54,10 @@ use crate::{
     discovery::build_discovery_configmap,
     event::{build_invalid_replica_message, publish_event},
     kerberos,
-    operations::pdb::add_pdbs,
+    operations::{
+        graceful_shutdown::{self, add_graceful_shutdown_config},
+        pdb::add_pdbs,
+    },
     product_logging::{extend_role_group_config_map, resolve_vector_aggregator_address},
     OPERATOR_NAME,
 };
@@ -204,6 +207,9 @@ pub enum Error {
         source: PropertiesWriterError,
         rolegroup: String,
     },
+
+    #[snafu(display("failed to configure graceful shutdown"))]
+    GracefulShutdown { source: graceful_shutdown::Error },
 }
 
 impl ReconcilerError for Error {
@@ -694,6 +700,8 @@ fn rolegroup_statefulset(
     )
     .context(FailedToCreateContainerAndVolumeConfigurationSnafu)?;
 
+    add_graceful_shutdown_config(merged_config, &mut pb).context(GracefulShutdownSnafu)?;
+
     let mut pod_template = pb.build_template();
     if let Some(pod_overrides) = hdfs.pod_overrides_for_role(role) {
         pod_template.merge_from(pod_overrides.clone());
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+use snafu::{ResultExt, Snafu};
+use stackable_hdfs_crd::MergedConfig;
+use stackable_operator::builder::PodBuilder;
+
+#[derive(Debug, Snafu)]
+pub enum Error {
+    #[snafu(display("Failed to set terminationGracePeriod"))]
+    SetTerminationGracePeriod {
+        source: stackable_operator::builder::pod::Error,
+    },
+}
+
+pub fn add_graceful_shutdown_config(
+    merged_config: &(dyn MergedConfig + Send + 'static),
+    pod_builder: &mut PodBuilder,
+) -> Result<(), Error> {
+    // This must be always set by the merge mechanism, as we provide a default value,
+    // users can not disable graceful shutdown.
+    if let Some(graceful_shutdown_timeout) = merged_config.graceful_shutdown_timeout() {
+        pod_builder
+            .termination_grace_period(graceful_shutdown_timeout)
+            .context(SetTerminationGracePeriodSnafu)?;
+    }
+
+    Ok(())
+}
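
Reviewer note: `add_graceful_shutdown_config` copies the merged timeout into the Pod and can fail. One plausible reason for the fallible call is that a duration may not fit the integer type of `terminationGracePeriodSeconds`; that is a guess, not something this commit states. The sketch below models the same shape with a hypothetical `PodSpec` struct and `apply_graceful_shutdown` function, using only the standard library.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;
use std::time::Duration;

/// Hypothetical, stripped-down pod spec holding only the field of interest.
#[derive(Debug, Default, PartialEq)]
struct PodSpec {
    termination_grace_period_seconds: Option<i64>,
}

/// Copies the timeout into the pod spec when one is configured.
/// Returns an error if the number of seconds does not fit into an `i64`.
fn apply_graceful_shutdown(
    timeout: Option<&Duration>,
    pod: &mut PodSpec,
) -> Result<(), TryFromIntError> {
    if let Some(timeout) = timeout {
        pod.termination_grace_period_seconds = Some(i64::try_from(timeout.as_secs())?);
    }
    Ok(())
}
```

When no timeout is configured the pod spec is left untouched, mirroring the `if let Some(...)` guard in the diff.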

rust/operator/src/operations/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -1 +1,2 @@
+pub mod graceful_shutdown;
 pub mod pdb;

tests/templates/kuttl/smoke/30-assert.yaml.j2

Lines changed: 3 additions & 0 deletions
@@ -23,6 +23,7 @@ spec:
 - name: vector
 {% endif %}
 - name: zkfc
+terminationGracePeriodSeconds: 900
 status:
 readyReplicas: 2
 replicas: 2
@@ -46,6 +47,7 @@ spec:
 {% if lookup('env', 'VECTOR_AGGREGATOR') %}
 - name: vector
 {% endif %}
+terminationGracePeriodSeconds: 900
 status:
 readyReplicas: 1
 replicas: 1
@@ -69,6 +71,7 @@ spec:
 {% if lookup('env', 'VECTOR_AGGREGATOR') %}
 - name: vector
 {% endif %}
+terminationGracePeriodSeconds: 1800
 status:
 readyReplicas: {{ test_scenario['values']['number-of-datanodes'] }}
 replicas: {{ test_scenario['values']['number-of-datanodes'] }}
