Prepare shared code: Support running on non default dns settings #893
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
nameserver 10.243.21.53 | ||
options ndots:5 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
search baz svc.foo.bar foo.bar | ||
search sble-operators.svc.cluster.local svc.cluster.local cluster.local | ||
nameserver 10.243.21.53 | ||
options ndots:5 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
search sble-operators.svc.cluster.local svc.cluster.local cluster.local | ||
nameserver 10.243.21.53 | ||
options ndots:5 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
search openshift-service-ca-operator.svc.cluster.local svc.cluster.local cluster.local cmx.repl-openshift.build | ||
nameserver 172.30.0.10 | ||
options ndots:5 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,187 @@ | ||
use std::{env, path::PathBuf, str::FromStr, sync::OnceLock}; | ||
|
||
use snafu::{ResultExt, Snafu}; | ||
use tracing::instrument; | ||
|
||
use crate::commons::networking::DomainName; | ||
|
||
const KUBERNETES_CLUSTER_DOMAIN_ENV: &str = "KUBERNETES_CLUSTER_DOMAIN"; | ||
const KUBERNETES_SERVICE_HOST_ENV: &str = "KUBERNETES_SERVICE_HOST"; | ||
|
||
const KUBERNETES_CLUSTER_DOMAIN_DEFAULT: &str = "cluster.local"; | ||
const RESOLVE_CONF_FILE_PATH: &str = "/etc/resolv.conf"; | ||
|
||
#[derive(Debug, Snafu)] | ||
pub enum Error { | ||
#[snafu(display("failed to read resolv config from {RESOLVE_CONF_FILE_PATH:?}"))] | ||
ReadResolvConfFile { source: std::io::Error }, | ||
|
||
#[snafu(display("failed to parse {cluster_domain:?} as domain name"))] | ||
ParseDomainName { | ||
|
||
source: crate::validation::Errors, | ||
cluster_domain: String, | ||
}, | ||
|
||
#[snafu(display(r#"unable to find "search" entry"#))] | ||
NoSearchEntry, | ||
|
||
#[snafu(display(r#"unable to find unambiguous domain in "search" entry"#))] | ||
AmbiguousDomainEntries, | ||
} | ||
|
||
/// Tries to retrieve the Kubernetes cluster domain. | ||
/// | ||
/// 1. Return `KUBERNETES_CLUSTER_DOMAIN` if set, otherwise | ||
/// 2. Return the cluster domain parsed from the `/etc/resolv.conf` file if `KUBERNETES_SERVICE_HOST` | ||
/// is set, otherwise fall back to `cluster.local`. | ||
/// | ||
/// This variable is initialized in [`crate::client::initialize_operator`], which is called in the | ||
/// main function. It can be used as suggested below. | ||
/// | ||
/// ## Usage | ||
/// | ||
/// ```no_run | ||
/// use stackable_operator::utils::cluster_domain::KUBERNETES_CLUSTER_DOMAIN; | ||
/// | ||
/// let kubernetes_cluster_domain = KUBERNETES_CLUSTER_DOMAIN.get() | ||
/// .expect("KUBERNETES_CLUSTER_DOMAIN must first be set by calling initialize_operator"); | ||
/// | ||
/// tracing::info!(%kubernetes_cluster_domain, "Found cluster domain"); | ||
/// ``` | ||
/// | ||
/// ## See | ||
/// | ||
/// - <https://github.com/stackabletech/issues/issues/436> | ||
/// - <https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/> | ||
pub static KUBERNETES_CLUSTER_DOMAIN: OnceLock<DomainName> = OnceLock::new(); | ||
|
||
#[instrument] | ||
pub(crate) fn retrieve_cluster_domain() -> Result<DomainName, Error> { | ||
// 1. Read KUBERNETES_CLUSTER_DOMAIN env var | ||
tracing::debug!("Trying to determine the Kubernetes cluster domain..."); | ||
|
||
match env::var(KUBERNETES_CLUSTER_DOMAIN_ENV) { | ||
Ok(cluster_domain) if !cluster_domain.is_empty() => { | ||
let cluster_domain = DomainName::from_str(&cluster_domain) | ||
.context(ParseDomainNameSnafu { cluster_domain })?; | ||
tracing::info!( | ||
%cluster_domain, | ||
"Using Kubernetes cluster domain from {KUBERNETES_CLUSTER_DOMAIN_ENV:?} environment variable" | ||
); | ||
return Ok(cluster_domain); | ||
} | ||
_ => {} | ||
}; | ||
|
||
// 2. If no env var is set, check if we run in a clustered (Kubernetes/OpenShift) environment | ||
// by checking if KUBERNETES_SERVICE_HOST is set. If not, default to 'cluster.local'. | ||
tracing::debug!( | ||
"Trying to determine the operator runtime environment as environment variable \ | ||
{KUBERNETES_CLUSTER_DOMAIN_ENV:?} is not set" | ||
); | ||
|
||
match env::var(KUBERNETES_SERVICE_HOST_ENV) { | ||
Ok(_) => { | ||
let cluster_domain = retrieve_cluster_domain_from_resolv_conf(RESOLVE_CONF_FILE_PATH)?; | ||
If we're going to do this hack (and I still don't think we should), can we please at least punt that until vNext instead of sneaking it in just before the release?
We have many customers that require this. I think there is no way to punt on this for now, but it certainly calls for intensive testing!
We can punt on autodetection while still keeping the override.
We removed auto-detection again in #896.
|
||
let cluster_domain = DomainName::from_str(&cluster_domain) | ||
.context(ParseDomainNameSnafu { cluster_domain })?; | ||
|
||
|
||
tracing::info!( | ||
%cluster_domain, | ||
"Using Kubernetes cluster domain from {RESOLVE_CONF_FILE_PATH:?} file" | ||
); | ||
|
||
Ok(cluster_domain) | ||
} | ||
Err(_) => { | ||
let cluster_domain = DomainName::from_str(KUBERNETES_CLUSTER_DOMAIN_DEFAULT) | ||
.expect("KUBERNETES_CLUSTER_DOMAIN_DEFAULT constant must be a valid domain"); | ||
|
||
tracing::info!( | ||
%cluster_domain, | ||
"Could not determine Kubernetes cluster domain as the operator is not running within Kubernetes, assuming default Kubernetes cluster domain" | ||
); | ||
|
||
Ok(cluster_domain) | ||
} | ||
} | ||
} | ||
|
||
#[instrument] | ||
fn retrieve_cluster_domain_from_resolv_conf( | ||
path: impl Into<PathBuf> + std::fmt::Debug, | ||
) -> Result<String, Error> { | ||
let path = path.into(); | ||
let content = std::fs::read_to_string(&path) | ||
.inspect_err(|error| { | ||
tracing::error!(%error, path = %path.display(), "Cannot read resolv conf"); | ||
}) | ||
.context(ReadResolvConfFileSnafu)?; | ||
|
||
// If there are multiple search directives, only the search list from the last instance is used. | ||
// See `man 5 resolv.conf`. | ||
let Some(last_search_entry) = content | ||
.lines() | ||
.rev() | ||
.map(|l| l.trim()) | ||
.find(|&l| l.starts_with("search")) | ||
.map(|l| l.trim_start_matches("search").trim()) | ||
else { | ||
return NoSearchEntrySnafu.fail(); | ||
}; | ||
|
||
let Some(shortest_entry) = last_search_entry | ||
.split_ascii_whitespace() | ||
.min_by_key(|item| item.len()) | ||
else { | ||
return AmbiguousDomainEntriesSnafu.fail(); | ||
}; | ||
|
||
// NOTE (@Techassi): This is really sad and bothers me more than I would like to admit. This | ||
// clone could be removed by using the code directly in the calling function. But that would | ||
// remove the possibility to easily test the parsing. | ||
Ok(shortest_entry.to_owned()) | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
use std::path::PathBuf; | ||
|
||
use super::*; | ||
use rstest::rstest; | ||
|
||
#[test] | ||
fn use_different_kubernetes_cluster_domain_value() { | ||
let cluster_domain = "my-cluster.local".to_string(); | ||
|
||
// set different domain via env var | ||
unsafe { | ||
env::set_var(KUBERNETES_CLUSTER_DOMAIN_ENV, &cluster_domain); | ||
} | ||
|
||
// initialize the lock | ||
let _ = KUBERNETES_CLUSTER_DOMAIN.set(retrieve_cluster_domain().unwrap()); | ||
|
||
assert_eq!( | ||
cluster_domain, | ||
KUBERNETES_CLUSTER_DOMAIN.get().unwrap().to_string() | ||
); | ||
} | ||
|
||
#[rstest] | ||
fn parse_resolv_conf_pass( | ||
#[files("fixtures/cluster_domain/pass/*.resolv.conf")] path: PathBuf, | ||
) { | ||
assert_eq!( | ||
retrieve_cluster_domain_from_resolv_conf(path).unwrap(), | ||
KUBERNETES_CLUSTER_DOMAIN_DEFAULT | ||
); | ||
} | ||
|
||
#[rstest] | ||
fn parse_resolv_conf_fail( | ||
#[files("fixtures/cluster_domain/fail/*.resolv.conf")] path: PathBuf, | ||
) { | ||
assert!(retrieve_cluster_domain_from_resolv_conf(path).is_err()); | ||
} | ||
} |
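The parsing steps in `retrieve_cluster_domain_from_resolv_conf` (take the last `search` directive, then pick its shortest entry) can also be expressed over an in-memory string. That sidesteps the clone mentioned in the NOTE and keeps the logic testable without fixture files. A minimal sketch, assuming a hypothetical helper named `cluster_domain_from_search` that is not part of this PR. Unlike the PR code, it matches `search` as a whole token rather than as a prefix, and it returns `None` for both a missing and an empty `search` directive:

```rust
/// Hypothetical helper (not part of this PR): returns the shortest entry of the
/// last `search` directive in the given resolv.conf content, borrowing from the input.
fn cluster_domain_from_search(content: &str) -> Option<&str> {
    content
        .lines()
        // Walk the file backwards so the last `search` directive is found first.
        .rev()
        // Tokenize each line and keep only lines whose first token is exactly `search`.
        .find_map(|line| {
            let mut tokens = line.split_ascii_whitespace();
            (tokens.next() == Some("search")).then_some(tokens)
        })?
        // As in the PR, the shortest entry is assumed to be the cluster domain.
        .min_by_key(|entry| entry.len())
}
```

Because the result borrows from `content`, the caller decides whether and when to allocate, for example by passing the slice straight to `DomainName::from_str`.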
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
pub mod bash; | ||
pub mod cluster_domain; | ||
pub mod crds; | ||
pub mod logging; | ||
mod option; | ||
|
Why is this a global?
It's supposed to be the new "entry point" for operators. There will be more content in there in the future.
It's basically renaming `create_client` to `initialize_operator` to be more expressive. By doing this we enforce that operators need to go through this function and therefore initialize the `OnceLock`.
But it.. isn't? It says less about what's happening, nor does it explain why this place would be the entry point.
Such as?
This still doesn't explain why it needs to be a `OnceLock`.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have branch where I warn on an unsupported Kubernetes version. It's just not merged yet because of prioritization. In the future it may also read in the recommended product version or some global config object or whatnot or check if there is an update available.
Agreed, I didn't try do explain that. Do you have an alternative suggestion how we can make this easy-to-use in operators? It looks very intuitive to me, but I'm not a Rust expert!
In the beginning we had a LazyLock, but @Techassi mentioned that we would panic on e.g. an invalid DomainName being configured, and that the operators should error instead of panic, so we needed to change to a OnceLock.
Uh oh!
There was an error while loading. Please reload this page.
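The `LazyLock` versus `OnceLock` trade-off described above can be sketched with the standard library alone. A `LazyLock` initializer has to produce a value, so a validation failure can only surface as a panic on first access, whereas a `OnceLock` can be filled by a fallible function that hands the error back to the caller. The names `CLUSTER_DOMAIN`, `parse_domain`, and `initialize` below are hypothetical stand-ins, and the validation is deliberately simplistic; this is not the PR's code.

```rust
use std::sync::OnceLock;

/// Hypothetical global, standing in for `KUBERNETES_CLUSTER_DOMAIN`.
static CLUSTER_DOMAIN: OnceLock<String> = OnceLock::new();

/// Simplistic stand-in for `DomainName::from_str`: rejects empty input and whitespace.
fn parse_domain(raw: &str) -> Result<String, String> {
    if raw.is_empty() || raw.contains(char::is_whitespace) {
        return Err(format!("invalid domain name: {raw:?}"));
    }
    Ok(raw.to_owned())
}

/// Validates first, then stores. An invalid value is reported as an `Err`
/// instead of a panic, and the lock stays unset.
fn initialize(raw: &str) -> Result<&'static str, String> {
    let domain = parse_domain(raw)?;
    // `get_or_init` keeps the first stored value if the lock was already set.
    Ok(CLUSTER_DOMAIN.get_or_init(|| domain).as_str())
}
```

The cost of this design is the one raised in the review: every reader of the global has to handle the "not yet initialized" case, typically with an `expect`.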
That still sounds like a part of.. creating a k8s client.
I would have said you could store it in the `Client`, but on second thought maybe it would make more sense to create a separate `KubernetesClusterConfig` struct.
`LazyLock`/`OnceLock` is for caching expensive static values that can't be `const` for whatever reason (like a precomputed `HashMap`), not for configuration.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We replaced the LacyLock with a KubernetesClusterInfo struct in #896
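The suggestion in this thread, carrying the cluster domain in a dedicated struct instead of a global, might look roughly like the following. This is only a sketch under assumptions: the struct name `ClusterConfig`, its field, and the methods `from_env_value` and `service_fqdn` are hypothetical and are not claimed to match the `KubernetesClusterInfo` struct introduced in #896. It keeps the explicit override and the `cluster.local` default but omits auto-detection, and it stores a plain `String` where the real code would use `DomainName`.

```rust
/// Hypothetical configuration struct, passed explicitly to whoever needs it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ClusterConfig {
    pub cluster_domain: String,
}

impl ClusterConfig {
    const DEFAULT_CLUSTER_DOMAIN: &'static str = "cluster.local";

    /// Builds the config from an optional override, such as the value of the
    /// `KUBERNETES_CLUSTER_DOMAIN` environment variable. Unset or empty values
    /// fall back to the default.
    pub fn from_env_value(value: Option<&str>) -> Self {
        let cluster_domain = match value {
            Some(domain) if !domain.is_empty() => domain,
            _ => Self::DEFAULT_CLUSTER_DOMAIN,
        };
        Self {
            cluster_domain: cluster_domain.to_owned(),
        }
    }

    /// Example consumer: formats the fully qualified DNS name of a Service.
    pub fn service_fqdn(&self, service: &str, namespace: &str) -> String {
        format!("{service}.{namespace}.svc.{}", self.cluster_domain)
    }
}
```

An operator could construct this once at startup, for instance from `std::env::var("KUBERNETES_CLUSTER_DOMAIN").ok().as_deref()`, and hand it to the code that builds Service addresses, so no reader has to deal with an uninitialized global.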