
Commit d3b6fdd

WIP
1 parent a4d160c commit d3b6fdd

1 file changed

+80
-10
lines changed

modules/contributor/pages/adr/ADR028-discovery-revision.adoc

Lines changed: 80 additions & 10 deletions
@@ -15,11 +15,13 @@ v0.1, 2023-03-30
1515

1616
// Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to articulate the problem in form of a question.
1717

18-
This ADR is written with a specific problems in mind, but the goal is to reach a generic mechanism for the discovery of products.
18+
This ADR is written with specific problems in mind, but the goal is to reach a generic mechanism for the discovery of products.
1919
The current discovery mechanism is described https://docs.stackable.tech/home/stable/concepts/service_discovery.html[in our docs] (make sure to pick the 23.1 version).
2020

2121
Basically when clients connect to products managed by Stackable, they need to have certain information about how to connect to these products.
2222
Currently we expose some of this information, but not all (e.g. which CA cert the exposed product uses).
23+
On the other hand it could be the case that users have external services which Stackable services should use, e.g.
24+
a non-Stackable HDFS that a Stackable Trino should connect to.
2325

2426
== Decision Drivers
2527
We have some common use-cases that we need to express via the discovery mechanism:
@@ -35,11 +37,11 @@ We have some common use-cases that we need to express via the discovery mechanis
3537
*** In case of https: The SecretClass that provided the cert for the *server*
3638
** What AuthenticationClass must be used to authenticate
3739
*** null (no SecretClass): Means no authentication at all
38-
*** (future) static: One of these plain credentials
40+
*** static: One of these plain credentials
3941
*** tls: provides ca.crt that needs to have signed the *client* certificate
4042
*** ldap: <whatever>
4143
*** (future) kerberos: kdc where you can get a ticket from (together with the realm)
42-
*** (future) oauth: <whatever>
44+
*** (future) oauth/oidc: <whatever>
4345
*** (future) jwt: <whatever>
4446

4547
2. HDFS cluster
@@ -48,9 +50,9 @@ We have some common use-cases that we need to express via the discovery mechanis
4850
** hdfs-site
4951
** core-site
5052
** What AuthenticationClass must be used to authenticate
51-
*** (future) kerberos: kdc where you can get a ticket from (together with the realm)
52-
** The information about rpc encryption is already in the core-site, so need to expose it explicitly
53-
** The information about data encryption is already in the hdfs-site, so need to expose it explicitly
53+
*** kerberos: kdc where you can get a ticket from (together with the realm)
54+
** The information about rpc encryption is already in the core-site, so no need to expose it explicitly
55+
** The information about data encryption is already in the hdfs-site, so no need to expose it explicitly
5456

5557
== Considered Options
5658

@@ -84,11 +86,13 @@ spec:
8486

8587
* Easier to use for consuming applications outside of Stackable, as they can simply mount the CM or download it to a file.
8688
On the other hand, once we start putting in YAML objects that need to be parsed, we would lose this benefit.
89+
* Single API call to retrieve all running products
8790

8891
==== Cons
8992

9093
* No complex structure such as enums (we can mitigate this by sticking in custom yaml into the CM).
9194
Users don't have any form of validation when creating their own discovery CM e.g. pointing to their existing HDFS.
95+
* Cannot have two products with the same name, as the discovery CM names clash. One solution could be to prefix the name with the product name (e.g. trino-simple). This can cause other problems, such as CM names that are too long.
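
The prefixing idea and its length concern can be illustrated with a minimal Python sketch. The `<product>-<cluster>` naming scheme and the function name `discovery_configmap_name` are assumptions made for illustration and are not part of the ADR. The limit used is the 253-character maximum that Kubernetes applies to DNS subdomain names such as ConfigMap names.

```python
# Kubernetes limits DNS subdomain names (used for ConfigMap names) to 253 characters.
MAX_NAME_LENGTH = 253


def discovery_configmap_name(product: str, cluster: str) -> str:
    """Build a prefixed discovery ConfigMap name, e.g. 'trino-simple'.

    The '<product>-<cluster>' scheme is a hypothetical convention.
    """
    name = f"{product}-{cluster}"
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(
            f"discovery ConfigMap name is {len(name)} characters long, "
            f"the limit is {MAX_NAME_LENGTH}"
        )
    return name


# Two products with the same cluster name no longer clash.
trino_name = discovery_configmap_name("trino", "simple")
hdfs_name = discovery_configmap_name("hdfs", "simple")
```

With such a scheme, a Trino and an HDFS cluster that are both called `simple` get distinct discovery CM names, while an overly long cluster name is rejected early instead of failing at creation time.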
9296

9397
=== [1] Discovery Object: Use dedicated CRD object for every product
9498

@@ -111,17 +115,65 @@ spec:
111115
=== BEGIN CERTIFICATE ===
112116
XXX
113117
=== END CERTIFICATE ===
114-
authentication: |
118+
authentication:
115119
kerberos:
116120
secretClass: client-tls # Use this SecretClass to obtain a keytab
117121
----
118122

119123
==== Pros
120124

125+
* Validation by using e.g. complex enums
126+
* Common structures can be shared between all operators, such as `Listener` endpoints or TLS server certificate information
127+
121128
==== Cons
122129

123130
* Operator A needs to compile against operator B to have access to its discovery struct. An alternative would be to put the Discovery CRDs in operator-rs.
124131
* Operator versioning hell. On the other hand we have the same problem with ConfigMaps, as e.g. a newly introduced key is missing because of an older hdfs operator version.
132+
* Dependent Pods (such as hbase on hdfs) cannot simply mount a CM containing the hdfs-site and core-site. Instead, the hbase-operator needs to read the HdfsClusterDiscovery, copy the hdfs-site and core-site into a CM and then mount that into the hbase Pods. This can be solved by having the HdfsClusterDiscovery point to a CM that contains the hdfs-site and core-site XMLs.
133+
* Multiple API calls are needed to retrieve all running Stackable services (in stackablectl or cockpit). This would be a single API call in case of a discovery CM or a shared CRD for all product discoveries.
134+
* Side note: `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and describe external systems that we don't know anything about.
135+
136+
=== [1] Discovery Object: Use dedicated CRD object for every product - in combination with ConfigMap
137+
138+
Or use a dedicated HdfsClusterDiscovery CRD:
139+
140+
[source,yaml]
141+
----
142+
# This struct should *not* contain any information that a client may want to mount
143+
# Instead, put this kind of information into the CM
144+
#
145+
# This struct resides in a new repo stackable-discovery and is pulled in as a dependency in (possibly) operator-rs and all operators.
146+
apiVersion: hdfs.stackable.tech/v1alpha1
147+
kind: HdfsClusterDiscovery
148+
metadata:
149+
name: simple-hdfs
150+
spec:
151+
hdfsSitesConfigMap: hdfs-simple-hdfs
152+
productVersion: 3.3.4
153+
httpProtocol:
154+
http: {}
155+
# OR
156+
https:
157+
caSecretClass: tls
158+
authentication:
159+
kerberos:
160+
keytabSecretClass: client-tls # Use this SecretClass to obtain a keytab
161+
---
162+
apiVersion: v1
163+
kind: ConfigMap
164+
metadata:
165+
name: hdfs-simple-hdfs # prefix to avoid naming collisions
166+
spec:
167+
data:
168+
hdfs-site.xml: <xml>
169+
core-site.xml: <xml>
170+
----
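
The `# OR` comments in the example above mark mutually exclusive variants, which a CRD could express as an enum. As a rough illustration of the check such an enum would imply, here is a Python sketch. The helper `selected_variant` is hypothetical, and the dictionary simply mirrors the fields of the YAML example with the `https` variant chosen.

```python
def selected_variant(section: dict, allowed: set) -> str:
    """Return the single variant key set in `section`, or raise ValueError."""
    unknown = sorted(key for key in section if key not in allowed)
    if unknown:
        raise ValueError(f"unknown variants: {unknown}")
    present = sorted(key for key in section if key in allowed)
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {sorted(allowed)}, got {present}"
        )
    return present[0]


# Mirrors the spec of the HdfsClusterDiscovery example, with https selected.
spec = {
    "hdfsSitesConfigMap": "hdfs-simple-hdfs",
    "productVersion": "3.3.4",
    "httpProtocol": {"https": {"caSecretClass": "tls"}},
    "authentication": {"kerberos": {"keytabSecretClass": "client-tls"}},
}

protocol = selected_variant(spec["httpProtocol"], {"http", "https"})
```

A spec that sets both `http` and `https`, or neither, would be rejected by this check. With a plain ConfigMap, users get no such validation.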
171+
172+
==== Pros
173+
174+
* Fixes mount problem from `Discovery Object: Use dedicated CRD object for every product`
175+
176+
==== Cons
125177

126178
=== [1] Discovery Object: Use dedicated CRD object shared between all products
127179

@@ -134,16 +186,34 @@ kind: ClusterDiscovery
134186
metadata:
135187
name: simple-hdfs
136188
spec:
137-
# Whatever
189+
productVersion: 3.3.4
190+
hdfs: # same structure as in HdfsClusterDiscovery example
191+
hdfsSitesConfigMap: hdfs-simple-hdfs
192+
httpProtocol:
193+
http: {}
194+
# OR
195+
https:
196+
caSecretClass: tls
197+
authentication:
198+
kerberos:
199+
keytabSecretClass: client-tls # Use this SecretClass to obtain a keytab
200+
# OR
201+
hbase: # Whatever
202+
# OR
203+
zookeeper: # Whatever
204+
# ...
138205
----
139206

140207
==== Pros
141208

142-
* Only one struct in operator-rs -> No cross-operator dependencies.
209+
* Only one struct in operator-rs => No cross-operator dependencies.
210+
* Single API call to retrieve all Stackable products. It is questionable whether this really helps, as callers are probably also interested in the status of the product, which needs further API calls (irrelevant - see Cons).
143211

144212
==== Cons
145213

146-
* It does not seem like it's possible to find a common struct all products can agree upon
214+
* All product discoveries are versioned together. E.g. a new mandatory field for hdfs requires all operators to bump the Discovery CRD to `v2`. We hope that this does not happen too often.
215+
* Names can collide
216+
* `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and describe external systems that we don't know anything about. So we would want to introduce a `Stacklet` object for listing anyway, which makes the `Pro` regarding the API calls irrelevant.
147217

148218
=== [1] Discovery Object: Write the discovery to Product CR status
149219
