Skip to content

Commit b4bab29

Browse files
committed
More formatting
1 parent 70b9f12 commit b4bab29

File tree

1 file changed

+77
-60
lines changed

1 file changed

+77
-60
lines changed

modules/contributor/pages/adr/ADR028-discovery-revision.adoc

Lines changed: 77 additions & 60 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
= ADR028: Discovery revision
1+
= ADR028: Service Discovery (Revision)
22
Sebastian Bernauer <sebastian.bernauer@stackable.tech>
33
v0.1, 2023-03-30
44
:status: draft
@@ -68,9 +68,15 @@ We have some common use-cases that we need to express via the discovery mechanis
6868

6969
== Considered Options
7070

71-
=== [1] Discovery Object: Use ConfigMap
71+
=== Discovery Object
7272

73-
Use a ConfigMap:
73+
This Discovery Object is written by the Product Operator. It describes how the product can be accessed by other
74+
connecting entities (clients). These clients can be inside the same Kubernetes cluster, in an external Kubernetes
75+
cluster or deployed using a different approach like running inside a VM or on bare metal hardware. The discovery object
76+
is used both internally by the SDP and by other actors (for example users). Each Discovery Object is created in the same
77+
namespace the product runs in. This means cross-namespace discovery needs special attention.
78+
79+
==== Discovery Object: Option 1 - ConfigMap
7480

7581
[source,yaml]
7682
----
@@ -94,22 +100,20 @@ spec:
94100
authenticationSecretClass: client-kerberos
95101
----
96102

97-
==== Pros
103+
===== Pros
98104

99105
* Easier to use for consuming applications outside of Stackable, as they can simply mount the CM or download it to a
100106
file. On the other hand when we start to put in yaml objects that need to be parsed we would loose the benefit.
101107
* Single API call to retrieve all running products
102108

103-
==== Cons
109+
===== Cons
104110

105111
* No complex structure such as enums (we can mitigate this by sticking in custom yaml into the CM). Users don't have any
106112
form of validation when creating their own discovery CM e.g. pointing to their existing HDFS.
107113
* Cannot have two products with the same name, as the discovery CM name clashes. One solution could be to prefix the
108114
product name (e.g. trino-simple), This can impose other problems such as too long CM names.
109115

110-
=== [1] Discovery Object: Use dedicated CRD object for every product
111-
112-
Or use a dedicated HdfsClusterDiscovery crd:
116+
==== Discovery Object: Option 2 - Dedicated CRD Object for every Product
113117

114118
[source,yaml]
115119
----
@@ -133,30 +137,28 @@ spec:
133137
secretClass: client-tls # Use this SecretClass to obtain a keytab
134138
----
135139

136-
==== Pros
140+
===== Pros
137141

138142
* Validation by using e.g. complex enums
139143
* Commons structure can be shared between all operators, such as `Listener` endpoints or tls server certificate
140144
information
141145

142-
==== Cons
146+
===== Cons
143147

144-
* Operator A needs to compile against operator b to have access to it's discovery struct. An alternative would be to put
148+
* Operator A needs to compile against operator B to have access to it's discovery struct. An alternative would be to put
145149
the Discovery CRDs in operator-rs.
146150
* Operator versioning hell. On the other hand we have the same problem with ConfigMaps, as e.g. a newly introduced key
147151
is missing because of an older hdfs operator version.
148152
* Dependant Pods (such as hbase on hdfs) can not simply mount a CM containing the hdfs-site and core-site. Instead the
149153
hbase-operator needs to read the HdfsClusterDiscovery, copy the hdfs-site and core-site into a CM and than mount that
150154
into the hbase Pods. This can be solved by the HdfsClusterDiscovery to point to a CM that contains hdfs-site and
151-
core-site xmls.
155+
core-site XML files.
152156
* Multiple API calls need to retrieve all running Stackable service (in stackablectl or cockpit). This would be a single
153157
API call in case of discovery CM or a shared CRD for all product discoveries.
154158
* Side-Note: `stackablectl stacklet list` should *not* look at discovery objects, as they can come from a user and are
155159
external systems, where we don't know anything about.
156160

157-
=== [1] Discovery Object: Use dedicated CRD object for every product - in combination with ConfigMap
158-
159-
Or use a dedicated HdfsClusterDiscovery crd:
161+
==== Discovery Object: Option 3 - Dedicated CRD Object + ConfigMap for every Product
160162

161163
[source,yaml]
162164
----
@@ -210,15 +212,13 @@ spec:
210212
# No CM needed
211213
----
212214

213-
==== Pros
215+
===== Pros
214216

215217
* Fixes mount problem from `Discovery Object: Use dedicated CRD object for every product`
216218

217-
==== Cons
218-
219-
=== [1] Discovery Object: Use dedicated CRD object shared between all products
219+
===== Cons
220220

221-
Or use a dedicated ClusterDiscovery crd:
221+
==== Discovery Object: Option 4 - Dedicated CRD Object shared between all Products
222222

223223
[source,yaml]
224224
----
@@ -245,13 +245,13 @@ spec:
245245
# ...
246246
----
247247

248-
==== Pros
248+
===== Pros
249249

250250
* Only one struct in operator-rs => No cross-operator dependencies.
251251
* Single API call to retrieve all stackable products. Question is if this really helps a lot, as callers probably also
252252
are interested in the status of the product, which needs further API calls (irrelevant - see Cons).
253253

254-
==== Cons
254+
===== Cons
255255

256256
* All product discoveries are versioned together. E.g. a new mandatory field for hdfs requires all operators to bump the
257257
Discovery CRD to `v2`. We hope that this does not happen too often.
@@ -260,25 +260,30 @@ spec:
260260
systems, where we don't know anything about. So in case we want to introduce a `Stacklet` object listing anyway, so
261261
the `Pro` regarding the API calls is irrelevant.
262262

263-
=== [1] Discovery Object: Write the discovery to Product CR status
263+
==== Discovery Object: Option 5 - Write Discovery to Product CR Status
264264

265265
Instead of writing discovery information to dedicated objects - such as CM or custom CR - we "simply" write the
266266
discovery information to the status of the Cluster CR.
267267

268-
==== Pros
268+
===== Pros
269269

270-
==== Cons
270+
* None currently
271+
272+
===== Cons
271273

272274
* It does not enable users to bring their own product and talk to it from Stackable, e.g. a user-provided HDFS.
273275
* It does not allow things such as a ZNode for Zookeeper as we either use the Zookeeper CR for discovery or we use a
274276
ZNode but than can't use a Zookeeper CR. Currently we have the freedom of either connection to a Zookeeper root dir or
275277
a ZNode transparently.
276278

277-
=== [2] TLS: Discovery config contains SecretClass
279+
'''
278280

279-
The discovery includes the SecretClass used to obtain the ca.crt used to validate the *server* certificate
281+
=== TLS
282+
283+
==== TLS: Option 1 - Discovery Config contains SecretClass
284+
285+
The discovery includes the SecretClass used to obtain the ca.crt used to validate the *server* certificate.
280286

281-
Trino discovery:
282287
[source,yaml]
283288
----
284289
apiVersion: trino.stackable.tech/v1alpha1
@@ -305,18 +310,20 @@ backends: # Don't look at the Superset CRD structure, we are only interested in
305310
discovery: my-trino
306311
----
307312

308-
==== Pros
313+
===== Pros
309314

310-
==== Cons
315+
* Currently none
311316

312-
=== [2] TLS: Client needs to specify SecretClass
313-
---
317+
===== Cons
318+
319+
* Currently none
320+
321+
==== TLS: Option 2 - Client specifies SecretClass
314322

315323
The discovery does *not* include the SecretClass used to obtain the *server* certificate. Instead the client must
316324
specify which SecretClass should be used to verify the *server* certificate. For usability reasons it can be omitted and
317325
defaults to the SecretClass the client uses for itself.
318326

319-
Trino discovery:
320327
[source,yaml]
321328
----
322329
apiVersion: trino.stackable.tech/v1alpha1
@@ -343,22 +350,21 @@ backends: # Don't look at the Superset CRD structure, we are only interested in
343350
tlsSecretClass: my-second-pki
344351
----
345352

346-
==== Pros
353+
===== Pros
347354

348355
* Operator does not need to read/look at the DiscoveryConfig (as we can statically set up the secret-op tls secretClass
349356
volumes rather than retrieving them from the DiscoveryConfig).
350357
* Some clients only support a single pki, in that case we could not give the ability to overwrite the secretClass coming
351358
from the product itself.
352359

353-
==== Cons
360+
===== Cons
354361

355362
* The client has to know what pki/secretClass the server is using.
356363
* Superset TrinoConnection could not only say "Connect this Superset and this Trino", but need to say "using this ca.crt
357364
to validate the Trino server"
358365

359-
=== [2] TLS: Include caCert in Discovery config
366+
==== TLS: Option 3 - Include caCert in Discovery Config
360367

361-
Trino discovery:
362368
[source,yaml]
363369
----
364370
metadata:
@@ -376,23 +382,21 @@ endpoint:
376382
=== END CERTIFICATE ===
377383
----
378384

379-
==== Pros
385+
===== Pros
380386

381387
* Easier for external clients to use as they don't need to know the concept of SecretClasses and don't even need to run
382388
withing k8s.
383389
* The client has to *not* know what pki/secretClass the server is using.
384390

385-
==== Cons
391+
===== Cons
386392

387-
* BIG QUESTION: How should the product operator get the ca cert from the SecretClass it uses to get the *server* cert
393+
* BIG QUESTION: How should the product operator get the CA cert from the SecretClass it uses to get the *server* cert
388394
from?
389395
** The secret-op could e.g. offer an HTTP api to fetch the ca.crt of a given SecretClass or e.g. write the ca.crt into
390396
the status of a SecretClass
391397

392398

393-
=== [2] TLS: Include SecretClass in discovery, user can override it
394-
395-
Trino discovery:
399+
==== TLS: Option 4 - Include SecretClass in Discovery (User can override it)
396400

397401
[source,yaml]
398402
----
@@ -422,17 +426,20 @@ backends: # Don't look at the Superset CRD structure, we are only interested in
422426
tlsSecretClass: my-second-pki
423427
----
424428

425-
==== Pros
429+
===== Pros
426430

427431
* Compromise with all usability and flexibility
428432

429-
==== Cons
433+
===== Cons
430434

431435
* Less secure by default
432436

433-
=== [3] Authentication: Add AuthenticationClass to Discovery Config
437+
'''
438+
439+
=== Authentication
440+
441+
==== Authentication: Option 1 - Add AuthenticationClass to Discovery Config
434442

435-
Trino discovery:
436443
[source,yaml]
437444
----
438445
metadata:
@@ -441,23 +448,23 @@ authentication:
441448
authenticationClass: my-class
442449
----
443450

444-
==== Pros
451+
===== Pros
452+
445453
* IMPORTANT: This is the only thing the server can know (how he is verifying client identities). He can not recommend an
446454
SecretClass used to obtain the client credentials. E.g. he uses an LDAP AuthenticationClass, there is no way it can
447455
now what SecretClass provides credentials accepted by LDAP. (Most cases it will be a user logging into a WebUI and the
448456
LDAP credentials of the user are not even stored anywhere but just remembered by the user)
449457

450-
==== Cons
458+
===== Cons
451459

452460
* Operator has to read the AuthenticationClass to determine its type (pw/tls/keytab) and set up the needed volumes and
453461
commands.
454462
// * The AuthenticationClass is meant to describe "how should a server verify connecting clients" and re-purpose it to
455463
// mean "how a client should authenticate itself". Image a user creates a Secret `trino-users` with *only* a ca.crt
456464
// and a SecretClass `trino-users` on top. The connecting client than has no way of knowing how to get a client cert.
457465

458-
=== [3] Authentication: Add SecretClass to Discovery Config
466+
==== Authentication: Option 2 - Add SecretClass to Discovery Config
459467

460-
Trino discovery:
461468
[source,yaml]
462469
----
463470
metadata:
@@ -466,13 +473,17 @@ authentication:
466473
secretClass: client-tls # Use this SecretClass to obtain your credentials (regardless of type of SecretClass)
467474
----
468475

469-
==== Cons
476+
===== Pros
477+
478+
* Currently none
479+
480+
===== Cons
470481

471482
* Operator has to read the SecretClass to determine its type (pw/tls/keytab) and set up the needed volumes and commands.
472483
* Image then SecretClass is of type `k8sSearch`. The connection client (e.g. controlled via superset-operator) than has
473484
no idea if he should expect a tls.crd or a keytab when mounting the SecretClass.
474485

475-
=== [3] Authentication: Add needed details
486+
==== Authentication: Option 3 - Add needed Details
476487

477488
Trino discovery:
478489
[source,yaml]
@@ -488,27 +499,33 @@ authentication:
488499
secretClass: client-kerberos # Use this SecretClass to obtain a keytab
489500
----
490501

491-
==== Pros
502+
===== Pros
492503

493504
* Operator has *not* to read the SecretClass to determine its type (pw/tls/keytab), as the type is already encoded in
494505
the Discovery config.
495506

496-
==== Cons
507+
===== Cons
497508

498-
=== [3] Authentication: Don't add information how to authenticate
509+
* Currently none
510+
511+
==== Authentication: Option 4 - Provide no Authentication Information
499512

500513
Trino discovery does not provide any information on how to authenticate
501514

502-
==== Cons
515+
===== Pros
516+
517+
* Currently none
518+
519+
===== Cons
503520

504521
* Not viable, as users need to know how to connect, and are not expected to try 50 different auth methods. We need to
505522
give them a AuthenticationClass, that says them e.g. what LDAP or PKI is used.
506523

507524
== Decision Outcome
508525

509-
[1] Discovery Object: `Discovery Object: Use dedicated CRD object for every product - in combination with ConfigMap`
510-
[2] Server tls cert: TODO
511-
[3] Authentication: `Authentication: Add AuthenticationClass to Discovery Config`
526+
* *Discovery Object:* Option 3
527+
* *Server TLS:* TBD, but leaning towards Option 2
528+
* *Authentication:* Option 1
512529

513530
=== Appendix A
514531

0 commit comments

Comments
 (0)