Description
Feature Request
We are working on the Flink Kubernetes Operator (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/) and would like to extend the operator with probes that detect the health of the deployment.
We have recently observed cases where the operator pod was running but could not do its job because of missing rolebindings or misconfigured dynamic namespaces.
What did you do?
Deployed the Flink Kubernetes Operator
What did you expect to see?
I would like users to be able to detect from the status of the operator that the Flink Kubernetes Operator is not working correctly.
What did you see instead? Under which circumstances?
Currently users only see that the operator itself is running, while the deployments / jobs are not created as expected.
Environment
Not sure about the exact environment at the moment; will try to collect the info.
Java operator version: 3.0.3
Possible Solution
We are thinking about implementing probes (liveness/readiness/startup).
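To make the idea concrete, here is a minimal sketch of what such a probe endpoint could look like, using only the JDK's built-in HTTP server. It is not an existing operator SDK API: the class name `HealthProbeSketch`, the `/healthz` path and the `healthCheck` supplier are all assumptions made for illustration. How the operator would actually detect missing rolebindings or misconfigured namespaces is left open; the supplier only stands in for that logic.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: exposes an HTTP endpoint that a Kubernetes httpGet probe could poll.
public class HealthProbeSketch {

    // Kubernetes treats HTTP status codes >= 200 and < 400 as probe success,
    // so a healthy operator answers 200 and an unhealthy one answers 503.
    public static int statusCode(boolean healthy) {
        return healthy ? 200 : 503;
    }

    // Starts the probe server. `healthCheck` is a placeholder for the real logic,
    // e.g. "can the operator still list/watch its resources in every namespace?".
    // Passing port 0 lets the OS pick a free port.
    public static HttpServer start(int port, BooleanSupplier healthCheck) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/healthz", exchange -> {
            boolean healthy;
            try {
                healthy = healthCheck.getAsBoolean();
            } catch (RuntimeException e) {
                // A failing check is reported as unhealthy rather than crashing the handler.
                healthy = false;
            }
            byte[] body = (healthy ? "ok" : "unhealthy").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(statusCode(healthy), body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

An operator could call something like `HealthProbeSketch.start(8085, () -> informersAreRunning())` at startup, where `informersAreRunning()` and the port are again hypothetical, and point the liveness, readiness or startup probe of the operator pod at that port and path.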