Description
This issue covers several related issues and aims to come up with a design for handling them. See:
- [bug] Operator becoming non-functional after transient RBAC changes #1419
- Allow operator to start even if not all watched namespaces are accessible #1405
- Support for K8S Probes #1412
Summary
By definition, informers do not work if the related RBAC is missing or misconfigured. This can happen:
- either when an operator starts up without the proper permissions, or
- while it is running and some permissions are revoked.
This is true for informers of both primary and secondary resources.
When a permission is revoked, the Informer (from the fabric8 client) will retry a few times but eventually just stops. (TODO: verify this behavior, and whether it is configurable.)
Note, however, that the framework supports dynamically changing the watched namespaces, so the same situation can arise when a new namespace is added for which there are no permissions. In that case it might be desirable to keep the operator functioning on the previously watched namespaces.
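To make the "keep functioning on the former namespaces" idea concrete, here is a minimal sketch in plain Java. It assumes a hypothetical `NamespaceInformerStarter` abstraction that throws when a namespace is not accessible; neither this interface nor `TolerantNamespaceStarter` exists in the framework or in the fabric8 client, they are only illustrations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical abstraction: starts the informer for a single namespace. */
interface NamespaceInformerStarter {
    // Assumed to throw if the namespace is not accessible (e.g. missing RBAC).
    void start(String namespace) throws Exception;
}

/** Starts informers namespace by namespace and records failures instead of aborting. */
final class TolerantNamespaceStarter {
    private final NamespaceInformerStarter starter;
    private final List<String> running = new ArrayList<>();
    private final Map<String, String> failed = new LinkedHashMap<>();

    TolerantNamespaceStarter(NamespaceInformerStarter starter) {
        this.starter = starter;
    }

    /** Tries every namespace; a failure in one does not affect the others. */
    void startAll(Iterable<String> namespaces) {
        for (String ns : namespaces) {
            if (running.contains(ns)) {
                continue; // already watched, nothing to do
            }
            try {
                starter.start(ns);
                running.add(ns);
                failed.remove(ns); // a previously failing namespace has recovered
            } catch (Exception e) {
                failed.put(ns, String.valueOf(e.getMessage()));
            }
        }
    }

    List<String> running() {
        return List.copyOf(running);
    }

    Map<String, String> failed() {
        return Map.copyOf(failed);
    }
}
```

Calling `startAll` again later with the same namespaces would retry only those that previously failed, which is one possible way to recover once the permissions are granted.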
There are multiple ways to solve this:
- Just stop/restart the operator when an informer is not started properly
- Try to reconnect the informer (indefinitely)
- Expose metrics regarding the health of informers so that liveness probes can be added that eventually restart the operator.
@andreaTP proposed that having a callback function might be useful too.
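As a rough illustration of how the health-metrics option and the callback idea could fit together, the following plain-Java sketch aggregates a status per informer and notifies a user-supplied callback on changes. All names here (`InformerStatus`, `InformerHealthRegistry`, `report`, `allHealthy`) are hypothetical and not part of the framework or the fabric8 client. How an informer's health would actually be detected is left open, in line with the TODO above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

/** Hypothetical health states for a single informer. */
enum InformerStatus { HEALTHY, UNHEALTHY }

/** Hypothetical registry that aggregates informer health and notifies a callback. */
final class InformerHealthRegistry {
    private final Map<String, InformerStatus> statuses = new ConcurrentHashMap<>();
    private final BiConsumer<String, InformerStatus> onChange;

    InformerHealthRegistry(BiConsumer<String, InformerStatus> onChange) {
        this.onChange = onChange;
    }

    /** Records a status; the callback fires on the first report and on every change. */
    void report(String informerName, InformerStatus status) {
        InformerStatus previous = statuses.put(informerName, status);
        if (previous != status) {
            onChange.accept(informerName, status);
        }
    }

    /** What a probe endpoint could expose: true only if no informer is unhealthy. */
    boolean allHealthy() {
        return statuses.values().stream().allMatch(s -> s == InformerStatus.HEALTHY);
    }
}
```

The callback could log, emit a metric, trigger a reconnect attempt, or stop the operator, so the choice between the options listed above would be left to the user rather than hard-coded.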