Skip to content

Commit 70b9f12

Browse files
committed
Adjust formatting
1 parent 806a4a8 commit 70b9f12

File tree

1 file changed

+84
-44
lines changed

1 file changed

+84
-44
lines changed

modules/contributor/pages/adr/ADR028-discovery-revision.adoc

Lines changed: 84 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -16,28 +16,33 @@ v0.1, 2023-03-30
1616
1717
== Context and Problem Statement
1818

19-
// Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to articulate the problem in form of a question.
19+
// Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to
20+
// articulate the problem in form of a question.
2021

21-
This ADR is written with specific problems in mind, but the goal is to reach a generic mechanism for the discovery of products.
22-
The current discovery mechanism is described https://docs.stackable.tech/home/stable/concepts/service_discovery.html[in our docs] (make sure to pick the 23.1 version).
22+
This ADR is written with specific problems in mind, but the goal is to reach a generic mechanism for the discovery of
23+
products. The current discovery mechanism is described
24+
https://docs.stackable.tech/home/stable/concepts/service_discovery.html[in our docs] (make sure to pick the 23.1
25+
version).
2326

24-
Basically when clients connect to products managed by Stackable, they need to have certain information about how to connect to these products.
25-
Currently we expose some of these information, but not all (e.g. which ca cert the exposed product uses).
26-
On the other hand it could be the case that users have external services which Stackable services should use, e.g.
27-
an non-stackable HDFS where an Stackable Trino should connect to.
27+
Basically when clients connect to products managed by Stackable, they need to have certain information about how to
28+
connect to these products. Currently we expose some of these information, but not all (e.g. which ca cert the exposed
29+
product uses). On the other hand it could be the case that users have external services which Stackable services should
30+
use, e.g. an non-stackable HDFS where an Stackable Trino should connect to.
2831

29-
Question 1: How do we want to make products discoverable?
30-
Question 2: How should clients verify the identity of the server (e.g. the cert of the server)
31-
Question 3: How should clients authenticate themselves against the server?
32+
*Question 1:* How do we want to make products discoverable?
33+
*Question 2:* How should clients verify the identity of the server (e.g. the cert of the server)
34+
*Question 3:* How should clients authenticate themselves against the server?
3235

3336
== Decision Drivers
37+
3438
We have some common use-cases that we need to express via the discovery mechanism:
3539

3640
1. Trino cluster
3741
* Currently we don't write any discovery CM
3842
* We need at least
3943
** Coordinator address (currently only single coordinator is supported)
40-
*** Ideally split it into the following attributes, so that clients can construct the URI. (trino python client e.g. needs all of these attributes separately and I don't want to rely on parsing and extracting the URI)
44+
*** Ideally split it into the following attributes, so that clients can construct the URI. (trino python client e.g.
45+
needs all of these attributes separately and I don't want to rely on parsing and extracting the URI)
4146
**** protocol (http/https)
4247
**** host
4348
**** port
@@ -91,15 +96,16 @@ spec:
9196

9297
==== Pros
9398

94-
* Easier to use for consuming applications outside of Stackable, as they can simply mount the CM or download it to a file.
95-
On the other hand when we start to put in yaml objects that need to be parsed we would loose the benefit.
99+
* Easier to use for consuming applications outside of Stackable, as they can simply mount the CM or download it to a
100+
file. On the other hand when we start to put in yaml objects that need to be parsed we would loose the benefit.
96101
* Single API call to retrieve all running products
97102

98103
==== Cons
99104

100-
* No complex structure such as enums (we can mitigate this by sticking in custom yaml into the CM).
101-
Users don't have any form of validation when creating their own discovery CM e.g. pointing to their existing HDFS.
102-
* Cannot have two products with the same name, as the discovery CM name clashes. One solution could be to prefix the product name (e.g. trino-simple), This can impose other problems such as too long CM names.
105+
* No complex structure such as enums (we can mitigate this by sticking in custom yaml into the CM). Users don't have any
106+
form of validation when creating their own discovery CM e.g. pointing to their existing HDFS.
107+
* Cannot have two products with the same name, as the discovery CM name clashes. One solution could be to prefix the
108+
product name (e.g. trino-simple), This can impose other problems such as too long CM names.
103109

104110
=== [1] Discovery Object: Use dedicated CRD object for every product
105111

@@ -130,15 +136,23 @@ spec:
130136
==== Pros
131137

132138
* Validation by using e.g. complex enums
133-
* Commons structure can be shared between all operators, such as `Listener` endpoints or tls server certificate information
139+
* Commons structure can be shared between all operators, such as `Listener` endpoints or tls server certificate
140+
information
134141

135142
==== Cons
136143

137-
* Operator A needs to compile against operator b to have access to it's discovery struct. An alternative would be to put the Discovery CRDs in operator-rs.
138-
* Operator versioning hell. On the other hand we have the same problem with ConfigMaps, as e.g. a newly introduced key is missing because of an older hdfs operator version.
139-
* Dependant Pods (such as hbase on hdfs) can not simply mount a CM containing the hdfs-site and core-site. Instead the hbase-operator needs to read the HdfsClusterDiscovery, copy the hdfs-site and core-site into a CM and than mount that into the hbase Pods. This can be solved by the HdfsClusterDiscovery to point to a CM that contains hdfs-site and core-site xmls.
140-
* Multiple API calls need to retrieve all running Stackable service (in stackablectl or cockpit). This would be a single API call in case of discovery CM or a shared CRD for all product discoveries.
141-
* Side-Note: `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and are external systems, where we don't know anything about.
144+
* Operator A needs to compile against operator b to have access to it's discovery struct. An alternative would be to put
145+
the Discovery CRDs in operator-rs.
146+
* Operator versioning hell. On the other hand we have the same problem with ConfigMaps, as e.g. a newly introduced key
147+
is missing because of an older hdfs operator version.
148+
* Dependant Pods (such as hbase on hdfs) can not simply mount a CM containing the hdfs-site and core-site. Instead the
149+
hbase-operator needs to read the HdfsClusterDiscovery, copy the hdfs-site and core-site into a CM and than mount that
150+
into the hbase Pods. This can be solved by the HdfsClusterDiscovery to point to a CM that contains hdfs-site and
151+
core-site xmls.
152+
* Multiple API calls need to retrieve all running Stackable service (in stackablectl or cockpit). This would be a single
153+
API call in case of discovery CM or a shared CRD for all product discoveries.
154+
* Side-Note: `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and are
155+
external systems, where we don't know anything about.
142156

143157
=== [1] Discovery Object: Use dedicated CRD object for every product - in combination with ConfigMap
144158

@@ -234,27 +248,34 @@ spec:
234248
==== Pros
235249

236250
* Only one struct in operator-rs => No cross-operator dependencies.
237-
* Single API call to retrieve all stackable products. Question is if this really helps a lot, as callers probably also are interested in the status of the product, which needs further API calls (irrelevant - see Cons).
251+
* Single API call to retrieve all stackable products. Question is if this really helps a lot, as callers probably also
252+
are interested in the status of the product, which needs further API calls (irrelevant - see Cons).
238253

239254
==== Cons
240255

241-
* All product discoveries are versioned together. E.g. a new mandatory field for hdfs requires all operators to bump the Discovery CRD to `v2`. We hope that this does not happen too often.
256+
* All product discoveries are versioned together. E.g. a new mandatory field for hdfs requires all operators to bump the
257+
Discovery CRD to `v2`. We hope that this does not happen too often.
242258
* Names can collide
243-
* `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and are external systems, where we don't know anything about. So in case we want to introduce a `Stacklet` object listing anyway, so the `Pro` regarding the API calls is irrelevant.
259+
* `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and are external
260+
systems, where we don't know anything about. So in case we want to introduce a `Stacklet` object listing anyway, so
261+
the `Pro` regarding the API calls is irrelevant.
244262

245263
=== [1] Discovery Object: Write the discovery to Product CR status
246264

247-
Instead of writing discovery information to dedicated objects - such as CM or custom CR - we "simply" write the discovery information to the status of the Cluster CR.
265+
Instead of writing discovery information to dedicated objects - such as CM or custom CR - we "simply" write the
266+
discovery information to the status of the Cluster CR.
248267

249268
==== Pros
250269

251270
==== Cons
252271

253272
* It does not enable users to bring their own product and talk to it from Stackable, e.g. a user-provided HDFS.
254-
* It does not allow things such as a ZNode for Zookeeper as we either use the Zookeeper CR for discovery or we use a ZNode but than can't use a Zookeeper CR.
255-
Currently we have the freedom of either connection to a Zookeeper root dir or a ZNode transparently.
273+
* It does not allow things such as a ZNode for Zookeeper as we either use the Zookeeper CR for discovery or we use a
274+
ZNode but than can't use a Zookeeper CR. Currently we have the freedom of either connection to a Zookeeper root dir or
275+
a ZNode transparently.
256276

257277
=== [2] TLS: Discovery config contains SecretClass
278+
258279
The discovery includes the SecretClass used to obtain the ca.crt used to validate the *server* certificate
259280

260281
Trino discovery:
@@ -290,9 +311,10 @@ backends: # Don't look at the Superset CRD structure, we are only interested in
290311

291312
=== [2] TLS: Client needs to specify SecretClass
292313
---
293-
The discovery does *not* include the SecretClass used to obtain the *server* certificate.
294-
Instead the client must specify which SecretClass should be used to verify the *server* certificate.
295-
For usability reasons it can be omitted and defaults to the SecretClass the client uses for itself.
314+
315+
The discovery does *not* include the SecretClass used to obtain the *server* certificate. Instead the client must
316+
specify which SecretClass should be used to verify the *server* certificate. For usability reasons it can be omitted and
317+
defaults to the SecretClass the client uses for itself.
296318

297319
Trino discovery:
298320
[source,yaml]
@@ -323,13 +345,16 @@ backends: # Don't look at the Superset CRD structure, we are only interested in
323345

324346
==== Pros
325347

326-
* Operator does not need to read/look at the DiscoveryConfig (as we can statically set up the secret-op tls secretClass volumes rather than retrieving them from the DiscoveryConfig).
327-
* Some clients only support a single pki, in that case we could not give the ability to overwrite the secretClass coming from the product itself.
348+
* Operator does not need to read/look at the DiscoveryConfig (as we can statically set up the secret-op tls secretClass
349+
volumes rather than retrieving them from the DiscoveryConfig).
350+
* Some clients only support a single pki, in that case we could not give the ability to overwrite the secretClass coming
351+
from the product itself.
328352

329353
==== Cons
330354

331355
* The client has to know what pki/secretClass the server is using.
332-
* Superset TrinoConnection could not only say "Connect this Superset and this Trino", but need to say "using this ca.crt to validate the Trino server"
356+
* Superset TrinoConnection could not only say "Connect this Superset and this Trino", but need to say "using this ca.crt
357+
to validate the Trino server"
333358

334359
=== [2] TLS: Include caCert in Discovery config
335360

@@ -353,18 +378,22 @@ endpoint:
353378

354379
==== Pros
355380

356-
* Easier for external clients to use as they don't need to know the concept of SecretClasses and don't even need to run withing k8s.
381+
* Easier for external clients to use as they don't need to know the concept of SecretClasses and don't even need to run
382+
withing k8s.
357383
* The client has to *not* know what pki/secretClass the server is using.
358384

359385
==== Cons
360386

361-
* BIG QUESTION: How should the product operator get the ca cert from the SecretClass it uses to get the *server* cert from?
362-
** The secret-op could e.g. offer an HTTP api to fetch the ca.crt of a given SecretClass or e.g. write the ca.crt into the status of a SecretClass
387+
* BIG QUESTION: How should the product operator get the ca cert from the SecretClass it uses to get the *server* cert
388+
from?
389+
** The secret-op could e.g. offer an HTTP api to fetch the ca.crt of a given SecretClass or e.g. write the ca.crt into
390+
the status of a SecretClass
363391

364392

365393
=== [2] TLS: Include SecretClass in discovery, user can override it
366394

367395
Trino discovery:
396+
368397
[source,yaml]
369398
----
370399
apiVersion: trino.stackable.tech/v1alpha1
@@ -413,12 +442,18 @@ authentication:
413442
----
414443

415444
==== Pros
416-
* IMPORTANT: This is the only thing the server can know (how he is verifying client identities). He can not recommend an SecretClass used to obtain the client credentials. E.g. he uses an LDAP AuthenticationClass, there is no way it can now what SecretClass provides credentials accepted by LDAP. (Most cases it will be a user logging into a WebUI and the LDAP credentials of the user are not even stored anywhere but just remembered by the user)
445+
* IMPORTANT: This is the only thing the server can know (how he is verifying client identities). He can not recommend an
446+
SecretClass used to obtain the client credentials. E.g. he uses an LDAP AuthenticationClass, there is no way it can
447+
now what SecretClass provides credentials accepted by LDAP. (Most cases it will be a user logging into a WebUI and the
448+
LDAP credentials of the user are not even stored anywhere but just remembered by the user)
417449

418450
==== Cons
419451

420-
* Operator has to read the AuthenticationClass to determine its type (pw/tls/keytab) and set up the needed volumes and commands.
421-
// * The AuthenticationClass is meant to describe "how should a server verify connecting clients" and re-purpose it to mean "how a client should authenticate itself". Image a user creates a Secret `trino-users` with *only* a ca.crt and a SecretClass `trino-users` on top. The connecting client than has no way of knowing how to get a client cert.
452+
* Operator has to read the AuthenticationClass to determine its type (pw/tls/keytab) and set up the needed volumes and
453+
commands.
454+
// * The AuthenticationClass is meant to describe "how should a server verify connecting clients" and re-purpose it to
455+
// mean "how a client should authenticate itself". Image a user creates a Secret `trino-users` with *only* a ca.crt
456+
// and a SecretClass `trino-users` on top. The connecting client than has no way of knowing how to get a client cert.
422457

423458
=== [3] Authentication: Add SecretClass to Discovery Config
424459

@@ -434,7 +469,8 @@ authentication:
434469
==== Cons
435470

436471
* Operator has to read the SecretClass to determine its type (pw/tls/keytab) and set up the needed volumes and commands.
437-
* Image then SecretClass is of type `k8sSearch`. The connection client (e.g. controlled via superset-operator) than has no idea if he should expect a tls.crd or a keytab when mounting the SecretClass.
472+
* Image then SecretClass is of type `k8sSearch`. The connection client (e.g. controlled via superset-operator) than has
473+
no idea if he should expect a tls.crd or a keytab when mounting the SecretClass.
438474

439475
=== [3] Authentication: Add needed details
440476

@@ -454,7 +490,8 @@ authentication:
454490

455491
==== Pros
456492

457-
* Operator has *not* to read the SecretClass to determine its type (pw/tls/keytab), as the type is already encoded in the Discovery config.
493+
* Operator has *not* to read the SecretClass to determine its type (pw/tls/keytab), as the type is already encoded in
494+
the Discovery config.
458495

459496
==== Cons
460497

@@ -464,7 +501,8 @@ Trino discovery does not provide any information on how to authenticate
464501

465502
==== Cons
466503

467-
* Not viable, as users need to know how to connect, and are not expected to try 50 different auth methods. We need to give them a AuthenticationClass, that says them e.g. what LDAP or PKI is used.
504+
* Not viable, as users need to know how to connect, and are not expected to try 50 different auth methods. We need to
505+
give them a AuthenticationClass, that says them e.g. what LDAP or PKI is used.
468506

469507
== Decision Outcome
470508

@@ -473,7 +511,9 @@ Trino discovery does not provide any information on how to authenticate
473511
[3] Authentication: `Authentication: Add AuthenticationClass to Discovery Config`
474512

475513
=== Appendix A
476-
Let's model a kerberos secured HDFS with the Options "TLS: Include caCert in Discovery config" and "Authentication: Add needed details"
514+
515+
Let's model a kerberos secured HDFS with the Options "TLS: Include caCert in Discovery config" and "Authentication:
516+
Add needed details"
477517

478518
[source,yaml]
479519
----

0 commit comments

Comments
 (0)