ReflectorRunnable.watchHandler seems to be stuck after about 5min of inactivity #1578

Closed
@karunasagark

I'm working on a custom controller where I noticed this issue. After about 5 minutes of inactivity (i.e. no updates in the API server), the controller stops receiving watch events from the API server. As a result, any add or delete on the custom resource after that idle period is not reconciled by the controller.

On further debugging, I found that ReflectorRunnable.watchHandler, which should be constantly reading from the response stream, seems to be stuck: the "Receiving resourceVersion" log from ReflectorRunnable no longer shows up. Meanwhile, kubectl get <customresource> -w sees the updates as expected.

A minimal repro is in kubetest.zip, and I'm able to reproduce this consistently. I tried running it with both Java 8 and Java 11, with the same result. I took a jstack dump when the issue occurred but couldn't spot any problem in it; the reflector thread was in the RUNNABLE state:

"controller-reflector-com.XXXXX-0" #23 prio=5 os_prio=0 cpu=39.77ms elapsed=986.61s allocated=1080K defined_classes=39 tid=0x00007f19b856c800 nid=0x1bbe9 runnable  [0x00007f19b5ff2000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(java.base@11.0.10/Native Method)
        at java.net.SocketInputStream.socketRead(java.base@11.0.10/SocketInputStream.java:115)
        at java.net.SocketInputStream.read(java.base@11.0.10/SocketInputStream.java:168)
        at java.net.SocketInputStream.read(java.base@11.0.10/SocketInputStream.java:140)
        at sun.security.ssl.SSLSocketInputRecord.read(java.base@11.0.10/SSLSocketInputRecord.java:478)
        at sun.security.ssl.SSLSocketInputRecord.readHeader(java.base@11.0.10/SSLSocketInputRecord.java:472)
        at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(java.base@11.0.10/SSLSocketInputRecord.java:70)
        at sun.security.ssl.SSLSocketImpl.readApplicationRecord(java.base@11.0.10/SSLSocketImpl.java:1354)
        at sun.security.ssl.SSLSocketImpl$AppInputStream.read(java.base@11.0.10/SSLSocketImpl.java:963)
        at okio.Okio$2.read(Okio.java:140)
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
        at okio.RealBufferedSource.request(RealBufferedSource.java:72)
        at okio.RealBufferedSource.require(RealBufferedSource.java:65)
        at okio.RealBufferedSource.readHexadecimalUnsignedLong(RealBufferedSource.java:307)
        at okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.readChunkSize(Http1ExchangeCodec.java:492)
        at okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.read(Http1ExchangeCodec.java:471)
        at okhttp3.internal.connection.Exchange$ResponseBodySource.read(Exchange.java:286)
        at okio.RealBufferedSource.exhausted(RealBufferedSource.java:61)
        at io.kubernetes.client.util.Watch.hasNext(Watch.java:179)
        at io.kubernetes.client.informer.cache.ReflectorRunnable.watchHandler(ReflectorRunnable.java:160)
        at io.kubernetes.client.informer.cache.ReflectorRunnable.run(ReflectorRunnable.java:108)
        at io.kubernetes.client.informer.cache.Controller$$Lambda$126/0x000000080069a440.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.10/Executors.java:515)
        at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.10/FutureTask.java:305)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.10/ScheduledThreadPoolExecutor.java:305)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.10/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.10/ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(java.base@11.0.10/Thread.java:834)

   Locked ownable synchronizers:
        - <0x00000007655823d8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        - <0x00000007656801a0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
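
For readers following the trace: the reflector thread is sitting in a blocking socket read (SocketInputStream.socketRead0), and the JVM reports such a thread as RUNNABLE even while it is only waiting for bytes. The stdlib-only sketch below is not the client library's code, and the class and method names (IdleReadTimeoutSketch, readTimesOut) are made up for illustration. It shows the general behaviour: a read on a connection that never delivers data only returns if a read timeout is set. Whether a missing read timeout is what happens in the controller above is an assumption; this report does not identify the cause.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class IdleReadTimeoutSketch {

    /**
     * Opens a loopback connection to a server that accepts but never writes,
     * then attempts a single read with the given timeout.
     * Returns true if the read was abandoned with a SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without this call (or with a timeout of 0) the read below would block
            // indefinitely, because the peer stays connected but silent.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or EOF arrived; not expected in this sketch
            } catch (SocketTimeoutException e) {
                return true;  // the idle read was interrupted by the timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a timeout in place, the caller gets an exception it can react to, for example by re-establishing the watch, instead of waiting forever on a connection that may have been dropped silently somewhere between client and server.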
