Description
I'm working on a custom controller where I noticed this issue. After about 5 minutes of inactivity (i.e. no updates in the API server), the controller stops receiving watch events from the API server. As a result, any add/delete on the custom resource after those 5 minutes of inactivity is not reconciled by the controller.
On further debugging, I found that ReflectorRunnable.watchHandler, which should be constantly reading from the response stream, seems to be stuck: the "Receiving resourceVersion" log from ReflectorRunnable doesn't show up. Meanwhile, kubectl get <customresource> -w sees the updates as expected.
A minimal repo is attached as kubetest.zip, and I'm able to reproduce this consistently. I tried running it with both Java 8 and Java 11, with the same result as above. I took a jstack dump when the issue occurred but couldn't spot any problem in it; the reflector thread was in the RUNNABLE state:
"controller-reflector-com.XXXXX-0" #23 prio=5 os_prio=0 cpu=39.77ms elapsed=986.61s allocated=1080K defined_classes=39 tid=0x00007f19b856c800 nid=0x1bbe9 runnable [0x00007f19b5ff2000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(java.base@11.0.10/Native Method)
at java.net.SocketInputStream.socketRead(java.base@11.0.10/SocketInputStream.java:115)
at java.net.SocketInputStream.read(java.base@11.0.10/SocketInputStream.java:168)
at java.net.SocketInputStream.read(java.base@11.0.10/SocketInputStream.java:140)
at sun.security.ssl.SSLSocketInputRecord.read(java.base@11.0.10/SSLSocketInputRecord.java:478)
at sun.security.ssl.SSLSocketInputRecord.readHeader(java.base@11.0.10/SSLSocketInputRecord.java:472)
at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(java.base@11.0.10/SSLSocketInputRecord.java:70)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(java.base@11.0.10/SSLSocketImpl.java:1354)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(java.base@11.0.10/SSLSocketImpl.java:963)
at okio.Okio$2.read(Okio.java:140)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
at okio.RealBufferedSource.request(RealBufferedSource.java:72)
at okio.RealBufferedSource.require(RealBufferedSource.java:65)
at okio.RealBufferedSource.readHexadecimalUnsignedLong(RealBufferedSource.java:307)
at okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.readChunkSize(Http1ExchangeCodec.java:492)
at okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.read(Http1ExchangeCodec.java:471)
at okhttp3.internal.connection.Exchange$ResponseBodySource.read(Exchange.java:286)
at okio.RealBufferedSource.exhausted(RealBufferedSource.java:61)
at io.kubernetes.client.util.Watch.hasNext(Watch.java:179)
at io.kubernetes.client.informer.cache.ReflectorRunnable.watchHandler(ReflectorRunnable.java:160)
at io.kubernetes.client.informer.cache.ReflectorRunnable.run(ReflectorRunnable.java:108)
at io.kubernetes.client.informer.cache.Controller$$Lambda$126/0x000000080069a440.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.10/Executors.java:515)
at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.10/FutureTask.java:305)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.10/ScheduledThreadPoolExecutor.java:305)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.10/ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.10/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.10/Thread.java:834)
Locked ownable synchronizers:
- <0x00000007655823d8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
- <0x00000007656801a0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
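For anyone reading the dump above: a thread blocked in SocketInputStream.socketRead0 is reported by the JVM as RUNNABLE even though it is only waiting for bytes, so the state alone doesn't rule out a read that never returns. Below is a minimal sketch of that behaviour using plain java.net sockets. It is not the client library's code, and the class name IdleReadDemo and method readTimesOut are made up for illustration. Whether the watch connection here actually went idle or was dropped silently is an assumption, not something the dump confirms.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of the Kubernetes Java client.
class IdleReadDemo {

    /**
     * Connects to a local server that accepts the connection but never writes
     * anything, then attempts a blocking read with the given SO_TIMEOUT.
     * Returns true if the read was aborted by a SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // A timeout of 0 means "wait forever": the read below would then
            // block indefinitely, and jstack would still show the thread as RUNNABLE.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // the peer never sends data, so this blocks
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the idle read was detected and aborted
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

With a non-zero timeout the idle read surfaces as an exception instead of hanging, which is the kind of signal that would let a caller re-establish the connection. Whether a read timeout is the right remedy for the reflector's watch is a separate question that this sketch doesn't answer.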