diff --git a/docs/_data/sidebar.yml b/docs/_data/sidebar.yml index fe27a8e6aa..607bccf6a2 100644 --- a/docs/_data/sidebar.yml +++ b/docs/_data/sidebar.yml @@ -11,6 +11,8 @@ url: /docs/features - title: Dependent Resource Feature url: /docs/dependent-resources + - title: Workflows + url: /docs/workflows - title: Patterns and Best Practices url: /docs/patterns-best-practices - title: FAQ diff --git a/docs/documentation/dependent-resources.md b/docs/documentation/dependent-resources.md index 665e439f01..5296ea245a 100644 --- a/docs/documentation/dependent-resources.md +++ b/docs/documentation/dependent-resources.md @@ -137,7 +137,7 @@ See the full source code [here](https://github.com/java-operator-sdk/java-operat ## Managed Dependent Resources -As mentioned previously, one goal of this implementation is to make it possible to semi-declaratively create and wire +As mentioned previously, one goal of this implementation is to make it possible to declaratively create and wire dependent resources. You can annotate your reconciler with `@Dependent` annotations that specify which `DependentResource` implementation it depends upon. JOSDK will take the appropriate steps to wire everything together and call your @@ -145,8 +145,7 @@ appropriate steps to wire everything together and call your most use cases where the logic associated with the primary resource is usually limited to status handling based on the state of the secondary resources and the resources are not dependent on each other. -Note that all dependents will be reconciled in order. If an exception happens in one or more reconciliations, the -followup resources will be reconciled. +See [Workflows](https://javaoperatorsdk.io/docs/workflows) for how and in what order the resources are reconciled. This behavior and automated handling is referred to as "managed" because the `DependentResource` instances are managed by JOSDK. 
@@ -186,15 +185,16 @@ sample [here](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/s ## Standalone Dependent Resources -To use dependent resources in more complex workflows, when there are some resources needs to be created only in certain -conditions the standalone mode is available or the dependent resources are not independent of each other. -For example if calling an API needs to happen if a service is already up and running -(think configuring a running DB instance). +If only a subset of the resources should be managed by dependent resources, use standalone mode. In practice this means that the developer is responsible to initializing and managing and -calling `reconcile` method. However, this gives possibility for developers to fully customize the workflow for +calling the `reconcile` method. However, this gives developers the possibility to fully customize the process for reconciliation. Use standalone dependent resources for cases when managed does not fit. -The sample is similar to one above it just performs additional checks, and conditionally creates an `Ingress`: +Note that [Workflows](https://javaoperatorsdk.io/docs/workflows) also support standalone mode using +standalone resources. + +The sample is similar to the one above; it just performs additional checks and conditionally creates an `Ingress`: +(Note that this conditional creation is now also possible with Workflows.) ```java diff --git a/docs/documentation/workflows.md b/docs/documentation/workflows.md new file mode 100644 index 0000000000..e5454b0425 --- /dev/null +++ b/docs/documentation/workflows.md @@ -0,0 +1,276 @@ +--- +title: Workflows +description: Reference Documentation for Workflows +layout: docs +permalink: /docs/workflows +--- + +## Overview + +Kubernetes (k8s) does not have a notion of a resource that "depends on" another k8s resource, +in terms of the order in which a set of resources should be reconciled. 
However, Kubernetes operators are also used to manage +external (non-k8s) resources. Typically, when an operator manages a service, after the service is first deployed +some additional API calls are required to configure it. In this case the configuration step depends +on the service and related resources; in other words, the configuration needs to be reconciled after the service is +up and running. + +The intention behind workflows is to make it easy to describe more complex, almost arbitrary scenarios in a declarative +way. While [dependent resources](https://javaoperatorsdk.io/docs/dependent-resources) describe the logic of how a single +resource should be reconciled, workflows describe the process of how a set of target resources should be reconciled. + +Workflows are defined as a set of [dependent resources](https://javaoperatorsdk.io/docs/dependent-resources) (DR) +and dependencies between them, along with some conditions that mainly help define optional resources and +pre- and post-conditions that describe the expected state of a resource at a certain point in the workflow. + +## Elements of Workflow + +- **Dependent resource** (DR) - a resource that is managed by the reconcile logic. +- **Depends-on relation** - if a DR `B` depends on another DR `A`, then `B` will be reconciled after `A`. +- **Reconcile precondition** - a condition that needs to be fulfilled before the DR is reconciled. This also makes it + possible to define optional resources that, for example, are only created if a flag in a custom resource `.spec` has some + specific value. +- **Ready postcondition** - checks whether a resource can be considered "ready", typically whether the pods of a deployment are up + and running. +- **Delete postcondition** - during the cleanup phase it can be used to check whether the resource was successfully deleted, + so that the resource it depends on can be deleted as the next step. 
+ +## Defining Workflows + +Similarly to dependent resources, there are two ways to define workflows: in a managed or in a standalone manner. + +### Managed + +Annotations can be used to declaratively define a workflow for the reconciler. In this case the workflow is executed +before the `reconcile` method is called. The result of the reconciliation is accessed through the `context` object. + +The following hypothetical sample showcases all the elements. There are two resources, a Deployment and +a ConfigMap, where the ConfigMap depends on the Deployment. The Deployment has a ready-postcondition, so the ConfigMap is only +reconciled after the Deployment and only if the Deployment is ready. The ConfigMap has a reconcile-precondition +attached, therefore it is only reconciled if that condition holds. In addition, it has a delete-postcondition, +thus it is only considered deleted if that condition holds. + +```java +@ControllerConfiguration(dependents = { + @Dependent(name = DEPLOYMENT_NAME, type = DeploymentDependentResource.class, + readyPostcondition = DeploymentReadyCondition.class), + @Dependent(type = ConfigMapDependentResource.class, + reconcilePrecondition = ConfigMapReconcileCondition.class, + deletePostcondition = ConfigMapDeletePostCondition.class, + dependsOn = DEPLOYMENT_NAME) +}) +public class SampleWorkflowReconciler implements Reconciler<WorkflowAllFeatureCustomResource>, + Cleaner<WorkflowAllFeatureCustomResource> { + + public static final String DEPLOYMENT_NAME = "deployment"; + + @Override + public UpdateControl<WorkflowAllFeatureCustomResource> reconcile( + WorkflowAllFeatureCustomResource resource, + Context<WorkflowAllFeatureCustomResource> context) { + + resource.getStatus() + .setReady( + context.managedDependentResourceContext() // accessing workflow reconciliation results + .getWorkflowReconcileResult().orElseThrow() + .allDependentResourcesReady()); + return UpdateControl.patchStatus(resource); + } + + @Override + public DeleteControl cleanup(WorkflowAllFeatureCustomResource resource, + Context<WorkflowAllFeatureCustomResource> context) { + // omitted code + + return
DeleteControl.defaultDelete(); + } +} + +``` + +### Standalone + +In this mode the workflow is built manually using [standalone dependent resources](https://javaoperatorsdk.io/docs/dependent-resources#standalone-dependent-resources). +The workflow is created using a builder that is explicitly called in the reconciler (from the web page sample): + +```java +@ControllerConfiguration( + labelSelector = WebPageDependentsWorkflowReconciler.DEPENDENT_RESOURCE_LABEL_SELECTOR) +public class WebPageDependentsWorkflowReconciler + implements Reconciler<WebPage>, ErrorStatusHandler<WebPage>, EventSourceInitializer<WebPage> { + + public static final String DEPENDENT_RESOURCE_LABEL_SELECTOR = "!low-level"; + private static final Logger log = + LoggerFactory.getLogger(WebPageDependentsWorkflowReconciler.class); + + private KubernetesDependentResource<ConfigMap, WebPage> configMapDR; + private KubernetesDependentResource<Deployment, WebPage> deploymentDR; + private KubernetesDependentResource<Service, WebPage> serviceDR; + private KubernetesDependentResource<Ingress, WebPage> ingressDR; + + private Workflow<WebPage> workflow; + + public WebPageDependentsWorkflowReconciler(KubernetesClient kubernetesClient) { + initDependentResources(kubernetesClient); + workflow = new WorkflowBuilder<WebPage>() + .addDependent(configMapDR).build() + .addDependent(deploymentDR).build() + .addDependent(serviceDR).build() + .addDependent(ingressDR).withReconcileCondition(new IngressCondition()).build() + .build(); + } + + @Override + public Map<String, EventSource> prepareEventSources(EventSourceContext<WebPage> context) { + return EventSourceInitializer.nameEventSources( + configMapDR.initEventSource(context), + deploymentDR.initEventSource(context), + serviceDR.initEventSource(context), + ingressDR.initEventSource(context)); + } + + @Override + public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) { + + var result = workflow.reconcile(webPage, context); + + webPage.setStatus(createStatus(result)); + return UpdateControl.patchStatus(webPage); + } + // omitted code +} + +``` + +## Workflow Execution + +This section describes how a workflow is
executed in detail, how the ordering is determined and how conditions and +errors affect the behavior. The workflow execution, as its API also denotes, can be divided into two parts: +reconciliation and cleanup. [Cleanup](https://javaoperatorsdk.io/docs/features#the-reconcile-and-cleanup) is +executed if a resource is marked for deletion. + + +## Common Principles + +- **As complete as possible execution** - when a workflow is reconciled, it tries to reconcile as many resources as + possible. Thus, if an error happens or a ready condition is not met for a resource, all the other independent resources + will still be reconciled. This is the opposite of a fail-fast approach. The assumption is that in this way the + overall desired state is eventually achieved faster than with a fail-fast approach. +- **Concurrent reconciliation of independent resources** - the resources which do not depend on each other are processed + concurrently. The level of concurrency is customizable and can be set to one if required. By default, workflows use + the executor service from [ConfigurationService](https://github.com/java-operator-sdk/java-operator-sdk/blob/6f2a252952d3a91f6b0c3c38e5e6cc28f7c0f7b3/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationService.java#L120-L120). + +## Reconciliation + +This section describes how a workflow is executed: first the rules are defined, then they are explained using samples. + +### Rules + + 1. A DR is reconciled if it does not depend on another DR, or if ALL the DRs it depends on are ready. If it + has a reconcile-precondition, that condition must be met too. (Here ready means that the DR is successfully + reconciled - without any error - and, if it has a ready condition, that condition is met.) + 2. If a reconcile-precondition of a DR is not met, the DR is deleted. If there are DRs that depend on it, + they are deleted first - this applies recursively. 
That means that DRs are always deleted in the reverse order compared + to how they are reconciled. + 3. Delete is called on a dependent resource if, as described in point 2, it (possibly transitively) depends on a DR which + did not meet its reconcile-precondition, and either no DRs depend on it or the DRs which depend on it are already + successfully deleted (within the current execution). "Delete is called" means that the dependent resource is checked + for whether it implements the `Deleter` interface; if it implements it but does not implement the `GarbageCollected` interface, + the `Deleter.delete` method is called. If a DR does not implement the `Deleter` interface, it is considered deleted + automatically. "Successfully deleted" means that it is deleted and, if a delete-postcondition is present, it is met. + +### Samples + +Notation: the arrows depict the reconciliation ordering, that is, the depends-on relation in the reverse direction: +`1 --> 2` means `DR 2` depends on `DR 1`. + +#### Reconcile Sample + +
+ +stateDiagram-v2 +1 --> 2 +1 --> 3 +2 --> 4 +3 --> 4 + +
+ +- In this workflow the reconciliation of the nodes happens in the following way. DR `1` is reconciled first. + After that DR `2` and `3` are reconciled concurrently; once both have finished successfully, DR `4` is reconciled too. +- If, for example, `2` had a ready condition that evaluated as "not met", `4` would not be reconciled. + However, `1`, `2` and `3` would be reconciled. +- If `1` had a ready condition that is not met, none of `2`, `3` or `4` would be reconciled. +- If there were an error during the reconciliation of `2`, `4` would not be reconciled, but `3` would be + (and also `1`, of course). + +#### Sample with Reconcile Precondition + +
+ +stateDiagram-v2 +1 --> 2 +1 --> 3 +3 --> 4 +3 --> 5 + +
+ +- Consider the case where `3` has a reconcile-precondition that is not met. In that case DR `1` and `2` would be + reconciled. However, DR `3`, `4` and `5` would be deleted in the following way: DR `4` and `5` would be deleted concurrently. + DR `3` would be deleted if `4` and `5` are deleted successfully, that is, no error happened during deletion and all + delete-postconditions are met. + - If the delete-postcondition for `5` were not met, `3` would not be deleted; `4` would be. + - Similarly, if there were an error for `5`, `3` would not be deleted, but `4` would be. + +## Cleanup + +Cleanup works the same way as the deletion of resources during reconciliation when a reconcile-precondition is not met, just for +the whole workflow. + +The rule is relatively simple: + +Delete is called on a DR if there is no DR that depends on it, or if the DRs which depend on it are +already deleted successfully (within this execution of the workflow). "Successfully deleted" means that it is deleted and, +if a delete-postcondition is present, it is met. "Delete is called" means that the dependent resource is checked for whether it +implements the `Deleter` interface; if it implements it but does not implement the `GarbageCollected` interface, the `Deleter.delete` +method is called. If a DR does not implement the `Deleter` interface, it is considered deleted automatically. + +### Sample + +
+ +stateDiagram-v2 +1 --> 2 +1 --> 3 +2 --> 4 +3 --> 4 + +
+ +- The DRs are deleted in the following order: `4` is deleted first, then `2` and `3` are deleted concurrently, and after both + succeeded `1` is deleted. +- If the delete-postcondition were not met for `2`, DR `1` would not be deleted. DR `4` and `3` would be deleted. +- If the deletion of `2` failed with an error, DR `1` would not be deleted. DR `4` and `3` would be deleted. +- If the deletion of `4` failed with an error, no other DR would be deleted. + +## Error Handling + +As mentioned before, if an error happens during a reconciliation, the reconciliation of other dependent resources will +still happen. Multiple DRs might fail with an error, therefore workflows throw an +[`AggregatedOperatorException`](https://github.com/java-operator-sdk/java-operator-sdk/blob/86e5121d56ed4ecb3644f2bc8327166f4f7add72/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/AggregatedOperatorException.java) +that contains all the related exceptions. + +The exceptions can be handled by [`ErrorStatusHandler`](https://github.com/java-operator-sdk/java-operator-sdk/blob/86e5121d56ed4ecb3644f2bc8327166f4f7add72/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/AggregatedOperatorException.java). + +## Notes and Caveats + +- Delete is almost always called on every resource during the cleanup. However, it might be the case that the resource + was already deleted in a previous run, or was never created. This should not be a problem, since dependent resources + usually cache the state of the resource, so they are already aware that the resource does not exist, and basically do nothing + if delete is called on a resource that no longer exists. +- If a resource has owner references, it will be automatically deleted by the Kubernetes garbage collector if + the owner resource is marked for deletion. 
This might not be desirable; to make sure that delete is handled by the + workflow, do not use a garbage-collected Kubernetes dependent resource, use for example [`CRUDNoGCKubernetesDependentResource`](https://github.com/java-operator-sdk/java-operator-sdk/blob/86e5121d56ed4ecb3644f2bc8327166f4f7add72/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/dependent/kubernetes/CRUDNoGCKubernetesDependentResource.java). +- After a workflow is executed, no state is persisted regarding the workflow execution. On every reconciliation + all the resources are reconciled again; in other words, the whole workflow is evaluated again. + diff --git a/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/Operator.java b/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/Operator.java index 7679267a78..a17e86d8cc 100644 --- a/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/Operator.java +++ b/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/Operator.java @@ -68,7 +68,7 @@ public Operator(KubernetesClient kubernetesClient, ConfigurationService configur ConfigurationServiceProvider.set(configurationService); } - /** Adds a shutdown hook that automatically calls {@link #stop()} ()} when the app shuts down. */ + /** Adds a shutdown hook that automatically calls {@link #stop()} when the app shuts down. */ public void installShutdownHook() { Runtime.getRuntime().addShutdownHook(new Thread(this::stop)); }