Skip to content

PR for Micrometer Integration - #64 #486

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 26, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions operator-framework-core/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,13 @@
<description>Core framework for implementing Kubernetes operators</description>
<packaging>jar</packaging>

<properties>
<java.version>11</java.version>
<maven.compiler.source>11</maven.compiler.source>
<maven.compiler.target>11</maven.compiler.target>
<micrometer-core.version>1.7.3</micrometer-core.version>
</properties>

<build>
<plugins>
<plugin>
Expand Down Expand Up @@ -92,5 +99,10 @@
<artifactId>log4j-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-core</artifactId>
<version>${micrometer-core.version}</version>
</dependency>
</dependencies>
</project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
package io.javaoperatorsdk.operator;

import io.fabric8.kubernetes.client.CustomResource;
import io.javaoperatorsdk.operator.api.Context;
import io.javaoperatorsdk.operator.api.DeleteControl;
import io.javaoperatorsdk.operator.api.ResourceController;
import io.javaoperatorsdk.operator.api.UpdateControl;
import io.javaoperatorsdk.operator.api.config.ControllerConfiguration;
import io.micrometer.core.instrument.*;
import io.micrometer.core.instrument.distribution.DistributionStatisticConfig;
import io.micrometer.core.instrument.distribution.pause.PauseDetector;
import io.micrometer.core.instrument.noop.*;
import java.util.concurrent.TimeUnit;
import java.util.function.ToDoubleFunction;
import java.util.function.ToLongFunction;

public class Metrics {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this whole construct while wrapping the logic in a Metrics class is little bit smelly, not sure if there is a better way. But will take a look soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I wrote a separate class without wrapping it. @metacosm - Chris asked to have it like above wrapping the logic.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with wrapping the logic, @csviri? More than wrapping, it's about measuring things that make sense from the operator's perspective as opposed to calling more metrics-oriented methods. Another aspect is that all metrics code is in a single spot which makes it easier to figure out what's measured and change it if needed.

Copy link
Collaborator

@csviri csviri Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That the metrics code is in one place that is good. That we are mixing the application logic with metrics is not that nice. Those should be as loosely coupled as possible IMHO. And also seamlessly if possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't say I have a better answer for this now, just mentioning that design principle. So something to think about.

public static final Metrics NOOP = new Metrics(new NoopMeterRegistry(Clock.SYSTEM));
private final MeterRegistry registry;

public Metrics(MeterRegistry registry) {
this.registry = registry;
}

public <R extends CustomResource> UpdateControl<R> timeControllerCreateOrUpdate(
ResourceController<R> controller,
ControllerConfiguration<R> configuration,
R resource,
Context<R> context) {
final var name = configuration.getName();
final var timer =
Timer.builder("operator.sdk.controllers.execution.createorupdate")
.tags("controller", name)
.publishPercentiles(0.3, 0.5, 0.95)
.publishPercentileHistogram()
.register(registry);
try {
final var result = timer.record(() -> controller.createOrUpdateResource(resource, context));
String successType = "cr";
if (result.isUpdateStatusSubResource()) {
successType = "status";
}
if (result.isUpdateCustomResourceAndStatusSubResource()) {
successType = "both";
}
registry
.counter(
"operator.sdk.controllers.execution.success", "controller", name, "type", successType)
.increment();
return result;
} catch (Exception e) {
registry
.counter(
"operator.sdk.controllers.execution.failure",
"controller",
name,
"exception",
e.getClass().getSimpleName())
.increment();
throw e;
}
}

public DeleteControl timeControllerDelete(
ResourceController controller,
ControllerConfiguration configuration,
CustomResource resource,
Context context) {
final var name = configuration.getName();
final var timer =
Timer.builder("operator.sdk.controllers.execution.delete")
.tags("controller", name)
.publishPercentiles(0.3, 0.5, 0.95)
.publishPercentileHistogram()
.register(registry);
try {
final var result = timer.record(() -> controller.deleteResource(resource, context));
String successType = "notDelete";
if (result == DeleteControl.DEFAULT_DELETE) {
successType = "delete";
}
registry
.counter(
"operator.sdk.controllers.execution.success", "controller", name, "type", successType)
.increment();
return result;
} catch (Exception e) {
registry
.counter(
"operator.sdk.controllers.execution.failure",
"controller",
name,
"exception",
e.getClass().getSimpleName())
.increment();
throw e;
}
}

public void timeControllerRetry() {

registry
.counter(
"operator.sdk.retry.on.exception", "retry", "retryCounter", "type",
"retryException")
.increment();

}

public void timeControllerEvents() {

registry
.counter(
"operator.sdk.total.events.received", "events", "totalEvents", "type",
"eventsReceived")
.increment();

}

public static class NoopMeterRegistry extends MeterRegistry {
public NoopMeterRegistry(Clock clock) {
super(clock);
}

@Override
protected <T> Gauge newGauge(Meter.Id id, T t, ToDoubleFunction<T> toDoubleFunction) {
return new NoopGauge(id);
}

@Override
protected Counter newCounter(Meter.Id id) {
return new NoopCounter(id);
}

@Override
protected Timer newTimer(
Meter.Id id,
DistributionStatisticConfig distributionStatisticConfig,
PauseDetector pauseDetector) {
return new NoopTimer(id);
}

@Override
protected DistributionSummary newDistributionSummary(
Meter.Id id, DistributionStatisticConfig distributionStatisticConfig, double v) {
return new NoopDistributionSummary(id);
}

@Override
protected Meter newMeter(Meter.Id id, Meter.Type type, Iterable<Measurement> iterable) {
return new NoopMeter(id);
}

@Override
protected <T> FunctionTimer newFunctionTimer(
Meter.Id id,
T t,
ToLongFunction<T> toLongFunction,
ToDoubleFunction<T> toDoubleFunction,
TimeUnit timeUnit) {
return new NoopFunctionTimer(id);
}

@Override
protected <T> FunctionCounter newFunctionCounter(
Meter.Id id, T t, ToDoubleFunction<T> toDoubleFunction) {
return new NoopFunctionCounter(id);
}

@Override
protected TimeUnit getBaseTimeUnit() {
return TimeUnit.SECONDS;
}

@Override
protected DistributionStatisticConfig defaultHistogramConfig() {
return DistributionStatisticConfig.NONE;
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -25,21 +25,28 @@ public class Operator implements AutoCloseable {
private final Object lock;
private final List<ControllerRef> controllers;
private volatile boolean started;
private final Metrics metrics;

public Operator(KubernetesClient k8sClient, ConfigurationService configurationService) {
public Operator(
KubernetesClient k8sClient, ConfigurationService configurationService, Metrics metrics) {
this.k8sClient = k8sClient;
this.configurationService = configurationService;
this.closeables = new ArrayList<>();
this.lock = new Object();
this.controllers = new ArrayList<>();
this.started = false;
this.metrics = metrics;
}

/** Adds a shutdown hook that automatically calls {@link #close()} when the app shuts down. */
public void installShutdownHook() {
Runtime.getRuntime().addShutdownHook(new Thread(this::close));
}

public Operator(KubernetesClient k8sClient, ConfigurationService configurationService) {
this(k8sClient, configurationService, Metrics.NOOP);
}

public KubernetesClient getKubernetesClient() {
return k8sClient;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import com.fasterxml.jackson.databind.ObjectMapper;
import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.CustomResource;
import io.javaoperatorsdk.operator.Metrics;
import io.javaoperatorsdk.operator.api.ResourceController;
import java.util.Set;

Expand Down Expand Up @@ -93,4 +94,8 @@ default ObjectMapper getObjectMapper() {
default int getTerminationTimeoutSeconds() {
return DEFAULT_TERMINATION_TIMEOUT_SECONDS;
}

default Metrics getMetrics() {
return Metrics.NOOP;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@

import io.fabric8.kubernetes.client.CustomResource;
import io.fabric8.kubernetes.client.dsl.MixedOperation;
import io.javaoperatorsdk.operator.Metrics;
import io.javaoperatorsdk.operator.api.ResourceController;
import io.javaoperatorsdk.operator.api.RetryInfo;
import io.javaoperatorsdk.operator.api.config.ConfigurationService;
Expand All @@ -16,6 +17,7 @@
import io.javaoperatorsdk.operator.processing.retry.GenericRetry;
import io.javaoperatorsdk.operator.processing.retry.Retry;
import io.javaoperatorsdk.operator.processing.retry.RetryExecution;
import io.micrometer.core.instrument.Clock;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
Expand Down Expand Up @@ -46,6 +48,7 @@ public class DefaultEventHandler implements EventHandler {
private final int terminationTimeout;
private final ReentrantLock lock = new ReentrantLock();
private DefaultEventSourceManager eventSourceManager;
private ControllerConfiguration configuration;

public DefaultEventHandler(
ResourceController controller, ControllerConfiguration configuration, MixedOperation client) {
Expand All @@ -54,28 +57,28 @@ public DefaultEventHandler(
configuration.getName(),
GenericRetry.fromConfiguration(configuration.getRetryConfiguration()),
configuration.getConfigurationService().concurrentReconciliationThreads(),
configuration.getConfigurationService().getTerminationTimeoutSeconds());
configuration.getConfigurationService().getTerminationTimeoutSeconds(), configuration);
}

DefaultEventHandler(
EventDispatcher eventDispatcher,
String relatedControllerName,
Retry retry,
int concurrentReconciliationThreads) {
int concurrentReconciliationThreads, ControllerConfiguration configuration) {
this(
eventDispatcher,
relatedControllerName,
retry,
concurrentReconciliationThreads,
ConfigurationService.DEFAULT_TERMINATION_TIMEOUT_SECONDS);
ConfigurationService.DEFAULT_TERMINATION_TIMEOUT_SECONDS, configuration);
}

private DefaultEventHandler(
EventDispatcher eventDispatcher,
String relatedControllerName,
Retry retry,
int concurrentReconciliationThreads,
int terminationTimeout) {
int terminationTimeout, ControllerConfiguration configuration) {
this.eventDispatcher = eventDispatcher;
this.retry = retry;
this.controllerName = relatedControllerName;
Expand All @@ -85,6 +88,7 @@ private DefaultEventHandler(
new ScheduledThreadPoolExecutor(
concurrentReconciliationThreads,
runnable -> new Thread(runnable, "EventHandler-" + relatedControllerName));
this.configuration = configuration;
}

@Override
Expand Down Expand Up @@ -113,6 +117,10 @@ public void handleEvent(Event event) {
final Predicate<CustomResource> selector = event.getCustomResourcesSelector();
for (String uid : eventSourceManager.getLatestResourceUids(selector)) {
eventBuffer.addEvent(uid, event);
configuration
.getConfigurationService()
.getMetrics()
.timeControllerEvents();
executeBufferedEvents(uid);
}
} finally {
Expand Down Expand Up @@ -162,6 +170,10 @@ void eventProcessingFinished(

if (retry != null && postExecutionControl.exceptionDuringExecution()) {
handleRetryOnException(executionScope);
configuration
.getConfigurationService()
.getMetrics()
.timeControllerRetry();
return;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,12 @@ private PostExecutionControl handleCreateOrUpdate(
getName(resource),
getVersion(resource),
executionScope);
UpdateControl<R> updateControl = controller.createOrUpdateResource(resource, context);

UpdateControl<R> updateControl =
configuration
.getConfigurationService()
.getMetrics()
.timeControllerCreateOrUpdate(controller, configuration, resource, context);
R updatedCustomResource = null;
if (updateControl.isUpdateCustomResourceAndStatusSubResource()) {
updatedCustomResource = updateCustomResource(updateControl.getCustomResource());
Expand Down Expand Up @@ -153,8 +158,12 @@ private PostExecutionControl handleDelete(R resource, Context<R> context) {
"Executing delete for resource: {} with version: {}",
getName(resource),
getVersion(resource));
// todo: this is be executed in a try-catch statement, in case this fails
DeleteControl deleteControl = controller.deleteResource(resource, context);

DeleteControl deleteControl =
configuration
.getConfigurationService()
.getMetrics()
.timeControllerDelete(controller, configuration, resource, context);
final var useFinalizer = configuration.useFinalizer();
if (useFinalizer) {
if (deleteControl == DeleteControl.DEFAULT_DELETE
Expand Down
Loading