23.4/airflow changes #403

Closed · wants to merge 8 commits
4 changes: 2 additions & 2 deletions antora.yml
@@ -1,5 +1,5 @@
name: home
-version: "nightly"
+version: "23.4"
title: Stackable Documentation
nav:
- modules/ROOT/nav.adoc
Expand All @@ -9,4 +9,4 @@ nav:
- modules/operators/nav.adoc
- modules/contributor/nav.adoc
- modules/ROOT/nav2.adoc
-prerelease: true
+prerelease: false
24 changes: 15 additions & 9 deletions modules/ROOT/pages/release_notes.adoc
@@ -12,23 +12,23 @@ The following new major platform features were added:

Cluster Operation::

-The first part of https://docs.stackable.tech/home/stable/concepts/cluster_operations.html[ClusterOperations] was rolled out in every applicable Stackable Operator. This supports pausing the cluster reconciliation and stopping the cluster completely. Pausing reconciliation will not apply any changes to the Kubernetes resources (e.g. when changing the custom resource). Stopping the cluster will set all replicas of StatefulSets, Deployments or DaemonSets to zero and therefore deleting all Pods belonging to that cluster (not the PVCs).
+The first part of xref:concepts:cluster_operations.adoc[Cluster operations] was rolled out in every applicable Stackable Operator. It supports pausing cluster reconciliation and stopping the cluster completely. While reconciliation is paused, changes (e.g. to the custom resource) are not applied to the Kubernetes resources. Stopping the cluster sets all replicas of StatefulSets, Deployments or DaemonSets to zero, thereby deleting all Pods belonging to that cluster (but not the PVCs).
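As a sketch, both operations are toggled via fields on the product's custom resource; the field names below follow the linked concepts page and should be verified there:

```yaml
spec:
  clusterOperation:
    reconciliationPaused: true  # operator stops applying changes to Kubernetes resources
    stopped: true               # replicas scaled to zero; Pods deleted, PVCs kept
```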

Status Field::

-Operators of the Stackable Data Platform create, manage and delete Kubernetes resources: in order to easily query the health state of the products - and react accordingly - Stackable Operators use several predefined condition types to capture different aspects of a product's availability. See this https://docs.stackable.tech/home/stable/contributor/adr/ADR027-status[ADR] for more information.
+Operators of the Stackable Data Platform create, manage and delete Kubernetes resources. In order to easily query the health state of the products - and react accordingly - Stackable Operators use several predefined condition types to capture different aspects of a product's availability. See this xref:contributor:adr/ADR027-status[ADR] for more information.
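A populated status might look roughly like the following; the condition type and reason shown here are illustrative, and the ADR defines the authoritative set:

```yaml
status:
  conditions:
    - type: Available              # one of several predefined condition types
      status: "True"
      lastTransitionTime: "2023-04-17T09:00:00Z"
      reason: AllPodsReady         # illustrative reason string
```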

Default / Custom Affinities::

-In Kubernetes there are different ways to influence how Pods are assigned to Nodes. In some cases it makes sense to co-locate certain services that communicate a lot with each other, such as HBase regionservers with HDFS datanodes. In other cases it makes sense to distribute the Pods among as many Nodes as possible. There may also be additional requirements e.g. placing important services - such as HDFS namenodes - in different racks, datacenter rooms or even datacenters. This release implements default affinities that should suffice for many scenarios out-of-the box, while also allowing for custom affinity rules at a role and/or role-group level. See this https://docs.stackable.tech/home/stable/contributor/adr/ADR026-affinities[ADR] for more information.
+In Kubernetes there are different ways to influence how Pods are assigned to Nodes. In some cases it makes sense to co-locate certain services that communicate a lot with each other, such as HBase regionservers with HDFS datanodes. In other cases it makes sense to distribute the Pods among as many Nodes as possible. There may also be additional requirements, e.g. placing important services - such as HDFS namenodes - in different racks, datacenter rooms or even datacenters. This release implements default affinities that should suffice for many scenarios out of the box, while also allowing for custom affinity rules at a role and/or role-group level. See this xref:contributor:adr/ADR026-affinities.adoc[ADR] for more information.
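A role-level custom rule for the HBase example above might look roughly like this, preferring to schedule regionservers next to HDFS datanodes. The role and label names are assumptions for illustration; see the ADR for the actual structure:

```yaml
spec:
  regionServers:            # hypothetical role name
    config:
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 50
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: hdfs
                    app.kubernetes.io/component: datanode
                topologyKey: kubernetes.io/hostname
```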

Log Aggregation::

-The logging framework (added to the platform in Release 23.1) offers a consistent custom resource configuration and a separate, persisted sink (defaulting to OpenSearch). This has now been rolled out across all products. See this https://docs.stackable.tech/home/stable/contributor/adr/adr025-logging_architecture[ADR] and this https://docs.stackable.tech/home/stable/concepts/logging.html[concepts page] for more information.
+The logging framework (added to the platform in Release 23.1) offers a consistent custom resource configuration and a separate, persisted sink (defaulting to OpenSearch). This has now been rolled out across all products. See this xref:contributor:adr/adr025-logging_architecture[ADR] and this xref:concepts:logging.adoc[concepts page] for more information.

Service Type::

-The Service type can now be specified in all products. This currently differentiates between the internal ClusterIP and the external NodePort and is forward compatible with the [ListenerClass](https://docs.stackable.tech/home/stable/listener-operator/listenerclass.html) for the automatic exposure of Services via the Listener Operator. This change is not backwards compatible with older platform releases. For security reasons, the default is set to the cluster-internal (ClusterIP) ListenerClass. A cluster can be exposed outside of Kubernetes by setting clusterConfig.listenerClass to external-unstable (NodePort) or external-stable (LoadBalancer).
+The Service type can now be specified in all products. This currently differentiates between the internal ClusterIP and the external NodePort and is forward compatible with the xref:listener-operator:listenerclass.adoc[ListenerClass] for the automatic exposure of Services via the Listener Operator. This change is not backwards compatible with older platform releases. For security reasons, the default is set to the cluster-internal (ClusterIP) ListenerClass. A cluster can be exposed outside of Kubernetes by setting `clusterConfig.listenerClass` to `external-unstable` (NodePort) or `external-stable` (LoadBalancer).
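Exposing a cluster outside of Kubernetes is then a one-line change in the custom resource, for example:

```yaml
spec:
  clusterConfig:
    listenerClass: external-unstable  # NodePort; the default is cluster-internal (ClusterIP)
```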

New Versions::

@@ -45,7 +45,7 @@ Additionally, there are some individual product features that are noteworthy:
* https://github.com/stackabletech/airflow-operator/issues/177[Apache Airflow: load DAGs per git-sync]
* https://github.com/stackabletech/hdfs-operator/issues/289[Apache HDFS: Rework HDFS TLS / Auth structs]
* https://github.com/stackabletech/trino-operator/issues/395[Trino: Rework HDFS TLS / Auth structs]
-* https://github.com/stackabletech/secret-operator/issues/243[Secret operator: support running the Secret operator in rootless mode]
+* https://github.com/stackabletech/secret-operator/pull/252[Secret operator: support running the Secret operator in unprivileged mode]
* https://github.com/stackabletech/secret-operator/pull/235[Secret operator: allow configuring CSI docker images]
* https://github.com/stackabletech/secret-operator/issues/4[Secret operator: Kerberos keytab provisioning]

@@ -55,7 +55,7 @@ The following have been added to `stackablectl`:

==== Trino-iceberg demo

-This is a condensed form of the https://docs.stackable.tech/stackablectl/stable/demos/data-lakehouse-iceberg-trino-spark.html[data-lakehouse-iceberg-trino-spark] demo focusing on using the lakehouse to store and modify data. It demonstrates how to integrate Trino and Iceberg and should run on a local workstation.
+This is a condensed form of the xref:stackablectl::demos/data-lakehouse-iceberg-trino-spark.adoc[] demo focusing on using the lakehouse to store and modify data. It demonstrates how to integrate Trino and Iceberg and should run on a local workstation.
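Assuming the demo is named after its section heading, it can presumably be launched with:

```shell
# Install the condensed Trino/Iceberg demo on the current Kubernetes context
stackablectl demo install trino-iceberg
```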

==== Jupyterhub/Spark demo

@@ -101,6 +101,12 @@ spec:
```
This is an example for Trino, but the pattern is the same across all operators.

==== Stackable Operator for Apache Airflow

Existing Airflow clusters need to be deleted and recreated.

This is required because the UID of the Airflow user has https://github.com/stackabletech/airflow-operator/pull/219[changed] to be in line with the rest of the platform.
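One possible sequence for this, where the resource kind and the cluster name `airflow` are placeholders rather than values taken from the release notes:

```shell
# Save the current cluster definition, delete the cluster, then recreate it
kubectl get airflowcluster airflow -o yaml > airflow-cluster.yaml
kubectl delete airflowcluster airflow
kubectl apply -f airflow-cluster.yaml
```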

==== Stackable Operator for Apache HBase

* https://github.com/stackabletech/hbase-operator/issues/329[Consolidated top level configuration to clusterConfig]
@@ -452,7 +458,7 @@ This illustrates how to set up logging for Zookeeper and browse the results in a

==== LDAP stack and tutorial

-LDAP support has now been added to multiple products. An explanation of the overall approach is given xref:concepts:authentication.adoc[here] but in order to make the configuration steps a little clearer a xref:tutorials:authentication_with_openldap.adoc[tutorial] has been added that uses a dedicated Stackable https://docs.stackable.tech/stackablectl/stable/commands/stack.html[stack] for OpenLDAP and shows its usage.
+LDAP support has now been added to multiple products. An explanation of the overall approach is given xref:concepts:authentication.adoc[here], but in order to make the configuration steps a little clearer, a xref:tutorials:authentication_with_openldap.adoc[tutorial] has been added that uses a dedicated Stackable xref:stackablectl::commands/stack.adoc[stack] for OpenLDAP and shows its usage.
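If the stack follows the usual naming, installing it presumably looks like the following; the stack name is an assumption, and `stackablectl stack list` shows the real one:

```shell
# Install the OpenLDAP stack used by the authentication tutorial
stackablectl stack install openldap
```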


The xref:stackablectl::quickstart.adoc[quickstart guide] shows how to get started with `stackablectl`. This link lists the xref:stackablectl::demos/index.adoc[available demos].
@@ -832,7 +838,7 @@ This is the third release of the Stackable Data Platform, which this time focuse
The following new major platform features were added:

CPU and memory limits configurable::
-The operators now https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/[request] resources from Kubernetes for the products and required CPU and memory can now also be configured for all products. If your product instances are less performant after the update, the new defaults might be set too low and we recommend to https://docs.stackable.tech/kafka/stable/usage.html#_resource_requests[set custom requests] for your cluster.
+The operators now https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/[request] resources from Kubernetes for the products, and required CPU and memory can now also be configured for all products. If your product instances are less performant after the update, the new defaults might be set too low, and we recommend that you xref:kafka:usage-guide/storage-resources.adoc[set custom requests] for your cluster.
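The shared resource structure looks roughly like the following; the role name `brokers` is Kafka's, other products use their own role names, and the values are placeholders:

```yaml
spec:
  brokers:                # product-specific role name
    config:
      resources:
        cpu:
          min: 300m       # becomes the Kubernetes request
          max: "2"        # becomes the Kubernetes limit
        memory:
          limit: 2Gi      # used for both request and limit
```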

* https://github.com/stackabletech/opa-operator/pull/347[OpenPolicyAgent]
* https://github.com/stackabletech/zookeeper-operator/pull/563[Apache ZooKeeper]