modules/concepts/pages/s3.adoc: 18 additions & 15 deletions
@@ -43,16 +43,21 @@ spec:
 S3Bucket objects reference S3Connection objects. Both types of objects can be referenced by other resources. For example, in a DruidCluster you can specify a bucket for deep storage and an S3Connection for data ingestion.
 S3Connection objects can be defined in a standalone fashion or they can be inlined into a bucket object. Similarly, a bucket can be defined in a standalone object or inlined into an enclosing object.
 
-image::s3-overview.drawio.svg[TODO alt text]
+image::s3-overview.drawio.svg[A diagram showing four variations (A, B, C, D) of S3 resource referencing.]
 
-The diagram above shows three examples of how the objects can be
-structured.
-// Option 1
-In option 1 all objects are separate from each other. This provides maximum re-usability because the same connection or bucket object can be referenced by multiple resources. It also allows for separation of concerns across team members. Cluster administrators can define S3 connection objects that developers reference in their applications.
-// Option 2
-In option 2 the bucket is inlined in the cluster definition. This makes sense if you have a dedicated bucket for a specific purpose, if it is only used in this one cluster instance, in this single product.
-// Option 3
-Option 3 shows all S3 objects inlined in a DruidCluster resource. This is a very convenient way to quickly test something since the entire configuration is encapsulated in a single but potentially large manifest.
+The diagram above shows four examples of how the objects can be structured.
+
+// Variant A
+Variant A shows all S3 objects inlined in a DruidCluster resource. This is a very convenient way to quickly test something since the entire configuration is encapsulated in a single but potentially large manifest.
+
+// Variant B
+In variant B the S3 bucket has been split out into its own resource. It can now be referred to by multiple different tools as well.
+
+// Variant C
+In variant C the bucket is inlined in the cluster definition, but the connection remains a separate object. This makes sense if you have a dedicated bucket that is only used by this one cluster instance of a single product, while other buckets are hosted in the same place and therefore share the connection.
+
+// Variant D
+In variant D all objects are separate from each other. This provides maximum re-usability because the same connection or bucket object can be referenced by multiple resources. It also allows for separation of concerns across team members. Cluster administrators can define S3 connection objects that developers reference in their applications.
 
 === Examples
 
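The examples in this page cover the fully inlined and the fully separated variants. As a rough sketch of variant C (an inlined bucket that points at a standalone connection), something like the following could work. The `apiVersion`, the field names (`host`, `port`, `bucketName`, `connection`, `inline`, `reference`) and all resource names are assumptions for illustration and are not confirmed by this page. Check the S3 resources reference for the actual schema.

[source,yaml]
----
---
# Hypothetical standalone connection that several buckets can share
apiVersion: s3.stackable.tech/v1alpha1  # assumed API group and version
kind: S3Connection
metadata:
  name: shared-s3-connection  # assumed name
spec:
  host: s3.example.com  # placeholder endpoint
  port: 443
---
# Fragment of a product cluster spec, at the point where the bucket is configured.
# The bucket is inlined, while the connection is only referenced by name.
bucket:
  inline:
    bucketName: druid-deep-storage  # assumed bucket name
    connection:
      reference: shared-s3-connection
----

The bucket definition stays next to the cluster that uses it, while the endpoint details live in one place and can be changed without touching each cluster manifest.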
@@ -73,9 +78,9 @@ spec:
 
 ==== Inline definition
 
-The inline definition is variant 3 in the figure above.
+The inline definition is variant A in the figure above.
 
-image::s3-inline.drawio.svg[TODO alt text]
+image::s3-inline.drawio.svg[The DruidCluster encapsulates an S3Bucket, which in turn contains an S3Connection]
 
 This variant has the advantage that everything is defined in a single file, right where it is going to be used:
 
@@ -105,7 +110,7 @@ spec:
 
 Often multiple buckets are used across a data pipeline, and the same bucket may be used by different applications, so stand-alone resource definitions that can be referenced from multiple objects make sense.
 
-image::s3-fully-separated.drawio.svg[TODO alt text]
+image::s3-fully-separated.drawio.svg[One S3Connection is referenced by two different S3Buckets. The first Bucket is referenced by a DruidCluster and the second bucket is referenced by a SparkCluster and TrinoCluster. No object is inlined.]
 
 The DruidCluster references the S3Bucket, which in turn references the S3Connection. First the definition of the S3Connection:
 
@@ -123,7 +128,6 @@ spec:
 
 Then the bucket, which references the connection:
 
-
 [source,yaml]
 ----
 ---
@@ -155,7 +159,6 @@ spec:
 
 == Credentials
 
-
 No matter if a connection is specified inline or as a separate object, the credentials are always specified in the same way. You will need a `Secret` containing the access key ID and secret access key, a `SecretClass` and then a reference to this `SecretClass` where you want to specify the credentials.
 
 The `Secret`:
@@ -202,4 +205,4 @@ credentials:
 
 == What's next
 
-- Find details about the options of the S3 resource in the xref:reference:s3.adoc[S3 resources reference].
+Find details about the options of the S3 resource in the xref:reference:s3.adoc[S3 resources reference].