Skip to content

Deadlock issue at startup while using micrometer export #17765

Closed
@Liodene

Description

@Liodene

Hello everyone,

We came across a deadlock at startup on one of our applications related to Micrometer/Actuator metrics autoconfiguration.

For our business needs, I had to re-create an OpenTSDB metric exporter (since it's not bundled anymore with Spring Boot 2). I got inspired from the previous OpenTSDB exporter and current metric exporter (like Influx). The only specificity is that I wanted to validate my ConfigurationProperties class with javax.validation, which was where the problem begun.

Here is the thread dump of the deadlock

 =============================
 "localhost-startStop-1":
  waiting to lock monitor 0x00007fbe78e3d468 (object 0x00000000f7d22858, a java.lang.Object),
  which is held by "Service Thread"
 "Service Thread":
  waiting to lock monitor 0x00007fbe78e3b788 (object 0x00000000f6be5020, a java.util.concurrent.ConcurrentHashMap),
  which is held by "localhost-startStop-1"
 p),
 Java stack information for the threads listed above:
 ===================================================
 "localhost-startStop-1":
 #011at io.micrometer.core.instrument.MeterRegistry.getOrCreateMeter(MeterRegistry.java:558)
 #011- waiting to lock <0x00000000f7d22858> (a java.lang.Object)
 #011at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:528)
 #011at io.micrometer.core.instrument.MeterRegistry.timer(MeterRegistry.java:268)
 #011at io.micrometer.core.instrument.Timer$Builder.register(Timer.java:464)
 #011at com.zaxxer.hikari.metrics.micrometer.MicrometerMetricsTracker.<init>(MicrometerMetricsTracker.java:78)
 #011at com.zaxxer.hikari.metrics.micrometer.MicrometerMetricsTrackerFactory.create(MicrometerMetricsTrackerFactory.java:20)
 #011at com.zaxxer.hikari.pool.HikariPool.setMetricsTrackerFactory(HikariPool.java:302)
 #011at com.zaxxer.hikari.HikariDataSource.setMetricsTrackerFactory(HikariDataSource.java:263)
 #011at org.springframework.boot.actuate.autoconfigure.metrics.jdbc.DataSourcePoolMetricsAutoConfiguration$HikariDataSourceMetricsConfiguration.bindMetricsRegistryToHikariDataSource(DataSourcePoolMetricsAutoConfiguration.java:126)
 #011at org.springframework.boot.actuate.autoconfigure.metrics.jdbc.DataSourcePoolMetricsAutoConfiguration$HikariDataSourceMetricsConfiguration.bindMetricsRegistryToHikariDataSources(DataSourcePoolMetricsAutoConfiguration.java:118)
 #011at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 #011at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 #011at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 #011at java.lang.reflect.Method.invoke(Method.java:498)
 #011at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredMethodElement.inject(AutowiredAnnotationBeanPostProcessor.java:708)
 #011at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:90)
 #011at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessProperties(AutowiredAnnotationBeanPostProcessor.java:374)
 #011at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1411)
 #011at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:592)
 #011at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:515)
 #011at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:320)
 #011at org.springframework.beans.factory.support.AbstractBeanFactory$$Lambda$129/789436751.getObject(Unknown Source)
 #011at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
 #011- locked <0x00000000f6be5020> (a java.util.concurrent.ConcurrentHashMap)
 #011at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:318)
 #011at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199)
 #011at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:845)
 #011at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:877)
 #011at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:549)
 #011- locked <0x00000000f6be4978> (a java.lang.Object)
 #011at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:140)
 #011at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:742)
 #011at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:389)
 #011at org.springframework.boot.SpringApplication.run(SpringApplication.java:311)
 #011at org.springframework.boot.web.servlet.support.SpringBootServletInitializer.run(SpringBootServletInitializer.java:151)
 #011at org.springframework.boot.web.servlet.support.SpringBootServletInitializer.createRootApplicationContext(SpringBootServletInitializer.java:131)
 #011at org.springframework.boot.web.servlet.support.SpringBootServletInitializer.onStartup(SpringBootServletInitializer.java:91)
 #011at org.springframework.web.SpringServletContainerInitializer.onStartup(SpringServletContainerInitializer.java:171)
 #011at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5154)
 #011- locked <0x00000000f570b1e0> (a org.apache.catalina.core.StandardContext)
 #011at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
 #011- locked <0x00000000f570b1e0> (a org.apache.catalina.core.StandardContext)
 #011at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:743)
 #011at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:719)
 #011at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:714)
 #011at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:970)
 #011at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1841)
 #011at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 #011at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 #011at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 #011at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 #011at java.lang.Thread.run(Thread.java:748)
 "Service Thread":
 #011at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:179)
 #011- waiting to lock <0x00000000f6be5020> (a java.util.concurrent.ConcurrentHashMap)
 #011at org.springframework.beans.factory.support.AbstractBeanFactory.isTypeMatch(AbstractBeanFactory.java:493)
 #011at org.springframework.beans.factory.support.DefaultListableBeanFactory.doGetBeanNamesForType(DefaultListableBeanFactory.java:518)
 #011at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeanNamesForType(DefaultListableBeanFactory.java:481)
 #011at org.springframework.beans.factory.BeanFactoryUtils.beanNamesForTypeIncludingAncestors(BeanFactoryUtils.java:227)
 #011at org.springframework.beans.factory.support.DefaultListableBeanFactory.findAutowireCandidates(DefaultListableBeanFactory.java:1415)
 #011at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1214)
 #011at org.springframework.context.annotation.ContextAnnotationAutowireCandidateResolver$1.getTarget(ContextAnnotationAutowireCandidateResolver.java:90)
 #011at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:192)
 #011at com.sun.proxy.$Proxy94.forExecutables(Unknown Source)
 #011at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:99)
 #011at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
 #011at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
 #011at my.package.OpenTsdbProperties$$EnhancerBySpringCGLIB$$62b04d96.getStep(<generated>)
 #011at org.springframework.boot.actuate.autoconfigure.metrics.export.properties.StepRegistryPropertiesConfigAdapter$$Lambda$395/784926852.apply(Unknown Source)
 #011at org.springframework.boot.actuate.autoconfigure.metrics.export.properties.PropertiesConfigAdapter.get(PropertiesConfigAdapter.java:53)
 #011at org.springframework.boot.actuate.autoconfigure.metrics.export.properties.StepRegistryPropertiesConfigAdapter.step(StepRegistryPropertiesConfigAdapter.java:51)
 #011at io.micrometer.core.instrument.step.StepMeterRegistry.defaultHistogramConfig(StepMeterRegistry.java:94)
 #011at io.micrometer.core.instrument.MeterRegistry.lambda$timer$2(MeterRegistry.java:270)
 #011at io.micrometer.core.instrument.MeterRegistry$$Lambda$447/1190935252.apply(Unknown Source)
 #011at io.micrometer.core.instrument.MeterRegistry.getOrCreateMeter(MeterRegistry.java:575)
 #011- locked <0x00000000f7d22858> (a java.lang.Object)
 #011at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:528)
 #011at io.micrometer.core.instrument.MeterRegistry.timer(MeterRegistry.java:268)
 #011at io.micrometer.core.instrument.Timer$Builder.register(Timer.java:464)
 #011at io.micrometer.core.instrument.binder.jvm.JvmGcMetrics.lambda$bindTo$0(JvmGcMetrics.java:162)
 #011at io.micrometer.core.instrument.binder.jvm.JvmGcMetrics$$Lambda$413/692860962.handleNotification(Unknown Source)
 #011at sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:156)
 #011at sun.management.GarbageCollectorImpl.createGCNotification(GarbageCollectorImpl.java:143)
 56)
 Found 1 deadlock.

On the thread "localhost-startStop-1", which is the main thread:

  • start the application
  • lock on the singleton cache
  • autoconfigure actuator metrics
  • create meter and try to lock on the meter cache

On the thread "Service Thread":

  • GC is running which is instrumented by micrometer
  • try to export metrics
  • lock on meter cache
  • start instantiation of my custom exporter
  • try to validate my properties
  • try to lock on the singleton cache

Autoconfiguration class:

@Configuration
@AutoConfigureBefore({ CompositeMeterRegistryAutoConfiguration.class,
                       SimpleMetricsExportAutoConfiguration.class })
@AutoConfigureAfter(MetricsAutoConfiguration.class)
@ConditionalOnProperty(prefix = "management.metrics.export.opentsdb", name = "enabled", matchIfMissing = false)
@EnableConfigurationProperties(OpenTsdbProperties.class)
public class OpenTsdbMetricsExportAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean
    public OpenTsdbConfig openTsdbConfig(OpenTsdbProperties openTsdbProperties) {
        return new OpenTsdbPropertiesConfigAdapter(openTsdbProperties);
    }

    @Bean
    @ConditionalOnMissingBean
    public OpenTsdbMeterRegistry openTsdbMeterRegistry(OpenTsdbConfig openTsdbConfig,
                                                       Clock clock) {
        return new OpenTsdbMeterRegistry(openTsdbConfig, clock);
    }
}

Properties class:

@ConfigurationProperties(prefix = "management.metrics.export.opentsdb")
@Validated
public class OpenTsdbProperties extends StepRegistryProperties {

    /**
     * URI of the OpenTSDB server.
     */
    @NotNull
    private String uri ;

    /**
     * Hostname to use in the tags.
     */
    @NotNull
    private String hostname;

    /**
     * Whether to enable GZIP compression of metrics batches published to OpenTSDB.
     */
    private boolean compressed = true;

    public String getUri() {
        return this.uri;
    }

    public void setUri(String uri) {
        this.uri = uri;
    }

    public boolean isCompressed() {
        return this.compressed;
    }

    public void setCompressed(boolean compressed) {
        this.compressed = compressed;
    }

    public String getHostname() {
        return hostname;
    }

    public void setHostname(String hostname) {
        this.hostname = hostname;
    }
}

Regarding the reproducibility, the deadlock window is really short in our case:

  • happened on 95% of starup on one application
  • happened on 5% of startup on another one with similar setup
  • changing logging level changed the behaviour since it changed the timings. In our case it was failing in 95% of startup, in 10-15% of startup in DEBUG and never in TRACE
  • we were not able to recreate those condition locally

I know this sounds like an unprobable series of event which leads to a one time problem, but I guess it could have been something else than a bean validation, and something else than a custom exporter.

Isn't there an issue with the order in which the micrometer elements are configured? It seems odd to start instrumenting elements and reporting metrics before the system is fully up and running. My spring-muggle opinion is that MeterRegistry should be configured before Meter/Metrics.

Version used:

  • Spring-boot: 2.1.6-RELEASE
  • micrometer: 1.1.5
  • hibernate-validator: 6.0.17.Final
  • javax validation-api: 2.0.1-Final

In the meantime, we fixed the issue by doing the field validation manually.

Metadata

Metadata

Assignees

Labels

for: external-projectFor an external project and not something we can fix

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions