This documentation gives you an overview of our platform, how to install and manage it as well as some tutorials.
5
+
This documentation gives you an overview of the Stackable Data Platform, explains how to install and manage it, and provides some tutorials.
12
6
13
7
++++
14
8
<br>
@@ -26,14 +20,11 @@ This documentation gives you an overview of our platform, how to install and man
26
20
<h3>Introduction</h3>
27
21
++++
28
22
29
-
If you have any feedback regarding the documentation please either open an https://github.com/stackabletech/documentation/issues[issue], ask a https://github.com/stackabletech/documentation/discussions[question] or look at the source for this documentation in its https://github.com/stackabletech/documentation[repository].
30
-
31
-
Our Stackable platform allows you to deploy, scale and manage Big Data infrastructure in any environment as long as it can run https://kubernetes.io/[Kubernetes].
23
+
The Stackable Data Platform allows you to deploy, scale and manage data infrastructure in any environment running https://kubernetes.io/[Kubernetes].
32
24
25
+
You can find an overview of the supported components <<Components,below>>, as well as a full list of all supported product versions xref:operators:supported_versions.adoc[here].
33
26
34
-
IMPORTANT: Our platform used to be based on a different architecture (until November 2021) where we built an alternative Kubelet. We abandoned that effort for now and are moving towards a Kubernetes-native experience using the normal `kubelet`.
35
-
This is an ongoing effort and the documentation might not reflect reality everywhere.
36
-
We aim to update our platform as well as the documentation by the end of 2021 and will remove this note when the migration has completed.
27
+
If you have any feedback regarding the documentation please either open an https://github.com/stackabletech/documentation/issues[issue], ask a https://github.com/stackabletech/documentation/discussions[question] or look at the source for this documentation in its https://github.com/stackabletech/documentation[repository].
37
28
38
29
++++
39
30
</div>
@@ -54,7 +45,7 @@ While the platform got started in the _Big Data_ ecosystem we are in no way limi
54
45
55
46
You can declaratively build these environments, and we don't stop at the tool level as we also provide ways for the users to interact with the platform in the "as Code"-approach.
56
47
57
-
We are leveraging the https://www.openpolicyagent.org/[OpenPolicyAgent] to provide Authorization-as-Code.
48
+
We are leveraging the https://www.openpolicyagent.org/[Open Policy Agent] to provide Security-as-Code.
58
49
59
50
We are building a distribution that includes the “best of breed” of existing Open Source tools, but bundles them in a way, so it is easy to deploy a fully working stack of software. Most of the existing tools are “single purpose” tools, which often do not play nicely together out-of-the-box.
60
51
@@ -70,7 +61,8 @@ We are building a distribution that includes the “best of breed” of existing
70
61
71
62
We are using Kubernetes as our deployment platform.
72
63
And we're building https://kubernetes.io/docs/concepts/extend-kubernetes/operator/[Operators] for each of the products we support.
73
-
At the moment we support the following products (i.e. we have operators for each of those):
64
+
65
+
The Stackable Data Platform supports the following products:
74
66
75
67
++++
76
68
<br>
@@ -85,12 +77,60 @@ At the moment we support the following products (i.e. we have operators for each
85
77
++++
86
78
87
79
++++
88
-
<h3>Kafka Operator</h3>
80
+
<h3>Apache Airflow</h3>
81
+
++++
82
+
83
+
Apache Airflow is a workflow engine and a replacement for Apache Oozie, should you be using it.
84
+
85
+
xref:airflow::index.adoc[Read more]
86
+
87
+
++++
88
+
</div>
89
+
++++
90
+
91
+
++++
92
+
<div class="box">
93
+
++++
94
+
95
+
++++
96
+
<h3>Apache Druid</h3>
97
+
++++
98
+
99
+
Apache Druid is a real-time database to power modern analytics applications.
100
+
101
+
xref:druid::index.adoc[Read more]
102
+
103
+
++++
104
+
</div>
105
+
++++
106
+
107
+
++++
108
+
<div class="box">
109
+
++++
110
+
111
+
++++
112
+
<h3>Apache HBase</h3>
113
+
++++
114
+
115
+
HBase is a distributed, scalable, big data store.
116
+
117
+
xref:hbase::index.adoc[Read more]
118
+
119
+
++++
120
+
</div>
121
+
++++
122
+
123
+
++++
124
+
<div class="box">
125
+
++++
126
+
127
+
++++
128
+
<h3>Apache Hadoop HDFS</h3>
89
129
++++
90
130
91
-
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
131
+
HDFS is a distributed file system that provides high-throughput access to application data.
147
+
The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. We support the Hive Metastore.
163
+
Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
179
+
Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data.
195
+
Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
211
+
Apache Superset is a modern data exploration and visualization platform.
172
212
173
-
link:/superset/index.html[Read more]
213
+
xref:superset::index.adoc[Read more]
174
214
175
215
++++
176
216
</div>
177
217
++++
178
218
219
+
++++
220
+
<div class="box">
221
+
++++
222
+
223
+
++++
224
+
<h3>Trino</h3>
225
+
++++
226
+
227
+
Fast distributed SQL query engine for big data analytics that helps you explore your data universe.
228
+
229
+
xref:trino::index.adoc[Read more]
230
+
231
+
++++
232
+
</div>
233
+
++++
234
+
235
+
++++
236
+
<div class="box">
237
+
++++
238
+
239
+
++++
240
+
<h3>Apache ZooKeeper</h3>
241
+
++++
242
+
243
+
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.