Skip to content

chore: upgrade to fabric8 client 6.3.0 #1656

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 14 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 35 additions & 1 deletion docs/documentation/features.md
Original file line number Diff line number Diff line change
Expand Up @@ -699,7 +699,41 @@ leader left off should one of them become elected leader.
See sample configuration in the [E2E test](https://github.com/java-operator-sdk/java-operator-sdk/blob/8865302ac0346ee31f2d7b348997ec2913d5922b/sample-operators/leader-election/src/main/java/io/javaoperatorsdk/operator/sample/LeaderElectionTestOperator.java#L21-L23)
.

## Monitoring with Micrometer
## Runtime Info

[RuntimeInfo](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/RuntimeInfo.java#L16-L16)
is used mainly to check the actual health of event sources. Based on this information it is easy to implement custom
liveness probes.

[stopOnInformerErrorDuringStartup](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationService.java#L168-L168)
setting, where this flag usually needs to be set to false, in order to control the exact liveness properties.

See also an example implementation in the
[WebPage sample](https://github.com/java-operator-sdk/java-operator-sdk/blob/3e2e7c4c834ef1c409d636156b988125744ca911/sample-operators/webpage/src/main/java/io/javaoperatorsdk/operator/sample/WebPageOperator.java#L38-L43)

## Optimization of Caches

** Cache pruning is an experimental feature. Might a subject of change or even removal in the future. **

Operators using informers will initially cache the data for all known resources when starting up
so that access to resources can be performed quickly. Consequently, the memory required for the
operator to run and startup time will both increase quite dramatically when dealing with large
clusters with numerous resources.

It is thus possible to configure the operator to cache only pruned versions of the resources to
alleviate the memory usage of the primary and secondary caches. This setup, however, has
implications on how reconcilers deal with resources since they will only work with partial
objects. As a consequence, resources need to be updated using PATCH operations only, sending
only required changes.

To see how to use, and how to handle related caveats regarding how to deal with pruned objects
that leverage
[server side apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/) patches,
please check the provided
[integration test](https://github.com/java-operator-sdk/java-operator-sdk/blob/c688524e64205690ba15587e7ed96a64dc231430/operator-framework/src/test/java/io/javaoperatorsdk/operator/CachePruneIT.java)
and associates reconciler.

Pruned caches are currently not supported with the Dependent Resources feature.

## Automatic Generation of CRDs

Expand Down
2 changes: 1 addition & 1 deletion micrometer-support/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
<parent>
<artifactId>java-operator-sdk</artifactId>
<groupId>io.javaoperatorsdk</groupId>
<version>4.1.3-SNAPSHOT</version>
<version>4.2.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,27 +5,56 @@
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

import io.fabric8.kubernetes.api.model.HasMetadata;
import io.javaoperatorsdk.operator.OperatorException;
import io.javaoperatorsdk.operator.api.monitoring.Metrics;
import io.javaoperatorsdk.operator.api.reconciler.Constants;
import io.javaoperatorsdk.operator.api.reconciler.RetryInfo;
import io.javaoperatorsdk.operator.processing.Controller;
import io.javaoperatorsdk.operator.processing.GroupVersionKind;
import io.javaoperatorsdk.operator.processing.event.Event;
import io.javaoperatorsdk.operator.processing.event.ResourceID;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Tag;
import io.micrometer.core.instrument.Timer;

import static io.javaoperatorsdk.operator.api.reconciler.Constants.CONTROLLER_NAME;

public class MicrometerMetrics implements Metrics {

private static final String PREFIX = "operator.sdk.";
private static final String RECONCILIATIONS = "reconciliations.";
private static final String RECONCILIATIONS_EXECUTIONS = PREFIX + RECONCILIATIONS + "executions.";
private static final String RECONCILIATIONS_QUEUE_SIZE = PREFIX + RECONCILIATIONS + "queue.size.";
private final MeterRegistry registry;
private final Map<String, AtomicInteger> gauges = new ConcurrentHashMap<>();

public MicrometerMetrics(MeterRegistry registry) {
this.registry = registry;
}

@Override
public void controllerRegistered(Controller<?> controller) {
String executingThreadsName =
RECONCILIATIONS_EXECUTIONS + controller.getConfiguration().getName();
AtomicInteger executingThreads =
registry.gauge(executingThreadsName,
gvkTags(controller.getConfiguration().getResourceClass()),
new AtomicInteger(0));
gauges.put(executingThreadsName, executingThreads);

String controllerQueueName =
RECONCILIATIONS_QUEUE_SIZE + controller.getConfiguration().getName();
AtomicInteger controllerQueueSize =
registry.gauge(controllerQueueName,
gvkTags(controller.getConfiguration().getResourceClass()),
new AtomicInteger(0));
gauges.put(controllerQueueName, controllerQueueSize);
}

public <T> T timeControllerExecution(ControllerExecution<T> execution) {
final var name = execution.controllerName();
final var execName = PREFIX + "controllers.execution." + execution.name();
Expand Down Expand Up @@ -84,38 +113,67 @@ public void cleanupDoneFor(ResourceID resourceID, Map<String, Object> metadata)
}

@Override
public void reconcileCustomResource(ResourceID resourceID, RetryInfo retryInfoNullable,
public void reconcileCustomResource(HasMetadata resource, RetryInfo retryInfoNullable,
Map<String, Object> metadata) {
Optional<RetryInfo> retryInfo = Optional.ofNullable(retryInfoNullable);
incrementCounter(resourceID, RECONCILIATIONS + "started",
incrementCounter(ResourceID.fromResource(resource), RECONCILIATIONS + "started",
metadata,
RECONCILIATIONS + "retries.number",
"" + retryInfo.map(RetryInfo::getAttemptCount).orElse(0),
RECONCILIATIONS + "retries.last",
"" + retryInfo.map(RetryInfo::isLastAttempt).orElse(true));

AtomicInteger controllerQueueSize =
gauges.get(RECONCILIATIONS_QUEUE_SIZE + metadata.get(CONTROLLER_NAME));
controllerQueueSize.incrementAndGet();
}

@Override
public void finishedReconciliation(ResourceID resourceID, Map<String, Object> metadata) {
incrementCounter(resourceID, RECONCILIATIONS + "success", metadata);
public void finishedReconciliation(HasMetadata resource, Map<String, Object> metadata) {
incrementCounter(ResourceID.fromResource(resource), RECONCILIATIONS + "success", metadata);
}

public void failedReconciliation(ResourceID resourceID, Exception exception,
@Override
public void reconciliationExecutionStarted(HasMetadata resource, Map<String, Object> metadata) {
AtomicInteger reconcilerExecutions =
gauges.get(RECONCILIATIONS_EXECUTIONS + metadata.get(CONTROLLER_NAME));
reconcilerExecutions.incrementAndGet();
}

@Override
public void reconciliationExecutionFinished(HasMetadata resource, Map<String, Object> metadata) {
AtomicInteger reconcilerExecutions =
gauges.get(RECONCILIATIONS_EXECUTIONS + metadata.get(CONTROLLER_NAME));
reconcilerExecutions.decrementAndGet();

AtomicInteger controllerQueueSize =
gauges.get(RECONCILIATIONS_QUEUE_SIZE + metadata.get(CONTROLLER_NAME));
controllerQueueSize.decrementAndGet();
}

public void failedReconciliation(HasMetadata resource, Exception exception,
Map<String, Object> metadata) {
var cause = exception.getCause();
if (cause == null) {
cause = exception;
} else if (cause instanceof RuntimeException) {
cause = cause.getCause() != null ? cause.getCause() : cause;
}
incrementCounter(resourceID, RECONCILIATIONS + "failed", metadata, "exception",
incrementCounter(ResourceID.fromResource(resource), RECONCILIATIONS + "failed", metadata,
"exception",
cause.getClass().getSimpleName());
}

public <T extends Map<?, ?>> T monitorSizeOf(T map, String name) {
return registry.gaugeMapSize(PREFIX + name + ".size", Collections.emptyList(), map);
}

public static List<Tag> gvkTags(Class<? extends HasMetadata> resourceClass) {
final var gvk = GroupVersionKind.gvkFor(resourceClass);
return List.of(Tag.of("group", gvk.group), Tag.of("version", gvk.version),
Tag.of("kind", gvk.kind));
}

private void incrementCounter(ResourceID id, String counterName, Map<String, Object> metadata,
String... additionalTags) {
final var additionalTagsNb =
Expand Down
2 changes: 1 addition & 1 deletion operator-framework-bom/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@

<groupId>io.javaoperatorsdk</groupId>
<artifactId>operator-framework-bom</artifactId>
<version>4.1.3-SNAPSHOT</version>
<version>4.2.0-SNAPSHOT</version>
<name>Operator SDK - Bill of Materials</name>
<packaging>pom</packaging>
<description>Java SDK for implementing Kubernetes operators</description>
Expand Down
2 changes: 1 addition & 1 deletion operator-framework-core/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
<parent>
<groupId>io.javaoperatorsdk</groupId>
<artifactId>java-operator-sdk</artifactId>
<version>4.1.3-SNAPSHOT</version>
<version>4.2.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ public Operator(KubernetesClient kubernetesClient, ConfigurationService configur
}

/** Adds a shutdown hook that automatically calls {@link #stop()} when the app shuts down. */
@Deprecated(forRemoval = true)
public void installShutdownHook() {
if (!leaderElectionManager.isLeaderElectionEnabled()) {
Runtime.getRuntime().addShutdownHook(new Thread(this::stop));
Expand All @@ -89,12 +90,11 @@ public KubernetesClient getKubernetesClient() {
* where there is no obvious entrypoint to the application which can trigger the injection process
* and start the cluster monitoring processes.
*/
public void start() {
public synchronized void start() {
try {
if (started) {
return;
}
started = true;
controllerManager.shouldStart();
final var version = ConfigurationServiceProvider.instance().getVersion();
log.info(
Expand All @@ -110,6 +110,7 @@ public void start() {
// the leader election would start subsequently the processor if on
controllerManager.start(!leaderElectionManager.isLeaderElectionEnabled());
leaderElectionManager.start();
started = true;
} catch (Exception e) {
log.error("Error starting operator", e);
stop();
Expand Down Expand Up @@ -216,4 +217,11 @@ public int getRegisteredControllersNumber() {
return controllerManager.size();
}

public RuntimeInfo getRuntimeInfo() {
return new RuntimeInfo(this);
}

boolean isStarted() {
return started;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,12 @@
import io.fabric8.kubernetes.api.model.HasMetadata;
import io.javaoperatorsdk.operator.api.config.ControllerConfiguration;
import io.javaoperatorsdk.operator.api.config.NamespaceChangeable;
import io.javaoperatorsdk.operator.health.ControllerHealthInfo;

public interface RegisteredController<P extends HasMetadata> extends NamespaceChangeable {

ControllerConfiguration<P> getConfiguration();

ControllerHealthInfo getControllerHealthInfo();

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
package io.javaoperatorsdk.operator;

import java.util.*;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import io.javaoperatorsdk.operator.health.EventSourceHealthIndicator;
import io.javaoperatorsdk.operator.health.InformerWrappingEventSourceHealthIndicator;

/**
* RuntimeInfo in general is available when operator is fully started. You can use "isStarted" to
* check that.
*/
@SuppressWarnings("rawtypes")
public class RuntimeInfo {

private static final Logger log = LoggerFactory.getLogger(RuntimeInfo.class);

private final Set<RegisteredController> registeredControllers;
private final Operator operator;

public RuntimeInfo(Operator operator) {
this.registeredControllers = operator.getRegisteredControllers();
this.operator = operator;
}

public boolean isStarted() {
return operator.isStarted();
}

public Set<RegisteredController> getRegisteredControllers() {
checkIfStarted();
return registeredControllers;
}

private void checkIfStarted() {
if (!isStarted()) {
log.warn(
"Operator not started yet while accessing runtime info, this might lead to an unreliable behavior");
}
}

public boolean allEventSourcesAreHealthy() {
checkIfStarted();
return registeredControllers.stream()
.filter(rc -> !rc.getControllerHealthInfo().unhealthyEventSources().isEmpty())
.findFirst().isEmpty();
}

/**
* @return Aggregated Map with controller related event sources.
*/

public Map<String, Map<String, EventSourceHealthIndicator>> unhealthyEventSources() {
checkIfStarted();
Map<String, Map<String, EventSourceHealthIndicator>> res = new HashMap<>();
for (var rc : registeredControllers) {
res.put(rc.getConfiguration().getName(),
rc.getControllerHealthInfo().unhealthyEventSources());
}
return res;
}

/**
* @return Aggregated Map with controller related event sources that wraps an informer. Thus,
* either a
* {@link io.javaoperatorsdk.operator.processing.event.source.controller.ControllerResourceEventSource}
* or an
* {@link io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource}.
*/
public Map<String, Map<String, InformerWrappingEventSourceHealthIndicator>> unhealthyInformerWrappingEventSourceHealthIndicator() {
checkIfStarted();
Map<String, Map<String, InformerWrappingEventSourceHealthIndicator>> res = new HashMap<>();
for (var rc : registeredControllers) {
res.put(rc.getConfiguration().getName(), rc.getControllerHealthInfo()
.unhealthyInformerEventSourceHealthIndicators());
}
return res;
}
}
Loading