Description
- Which image of the operator are you using? ghcr.io/zalando/postgres-operator:v1.14.0
- Type of issue? Question
- Spilo image? https://github.com/zalando/spilo/releases/tag/3.0-p1
Hi All,
We are facing a container restart issue with the v1.14.0 operator; it does not occur with v1.9.0. The postgres / postgres-exporter containers get restarted during some of the operator's sync intervals (we do not see this on every sync).
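For reference, the exporter runs as a sidecar of the Postgres pods. A minimal sketch of the kind of cluster manifest we use is below; the names, version, port, and env values are illustrative and not our exact spec:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: pg-pgspilotest3
spec:
  teamId: "pg"
  numberOfInstances: 2
  postgresql:
    version: "15"
  volume:
    size: 10Gi
  # sidecar definition (illustrative values)
  sidecars:
    - name: postgres-exporter
      image: docker.com/wrouesnel/postgres_exporter:latest
      ports:
        - name: exporter
          containerPort: 9187
          protocol: TCP
      env:
        - name: DATA_SOURCE_URI
          value: "localhost:5432/postgres?sslmode=disable"
```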
STS events:
Events:
Type Reason Age From Message
Normal Killing 43m kubelet Container postgres-exporter definition changed, will be restarted
Normal Pulling 43m kubelet Pulling image "docker.com/wrouesnel/postgres_exporter:latest@sha256:54bd3ba6bc39a9da2bf382667db4dc249c96e4cfc837dafe91d6cc7d362829e0"
Normal Created 43m (x2 over 3d22h) kubelet Created container: postgres-exporter
Normal Started 43m (x2 over 3d22h) kubelet Started container postgres-exporter
Normal Pulled 43m kubelet Successfully pulled image "docker.com/wrouesnel/postgres_exporter:latest@sha256:54bd3ba6bc39a9da2bf382667db4dc249c96e4cfc837dafe91d6cc7d362829e0" in 1.071s (1.071s including waiting). Image size: 33164884 bytes.
State: Running
Started: Mon, 21 Apr 2025 10:37:10 +0530
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Thu, 17 Apr 2025 13:09:10 +0530
Finished: Mon, 21 Apr 2025 10:37:09 +0530
Ready: True
We are also noticing that the pods are being recreated with the reason "pod not yet restarted due to lazy update".
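As far as we understand, this message relates to the operator's `enable_lazy_spilo_upgrade` option, which applies new images to the statefulset without immediately rolling the pods. A minimal sketch of where that flag sits in the OperatorConfiguration (illustrative, not our exact configuration):

```yaml
apiVersion: "acid.zalan.do/v1"
kind: OperatorConfiguration
metadata:
  name: postgresql-operator-default-configuration
configuration:
  # when true, new images are written to the statefulset without an
  # immediate rolling restart; affected pods are then reported as
  # "not yet restarted due to lazy update" until they are recreated
  enable_lazy_spilo_upgrade: false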
Operator log:
time="2025-04-21T06:43:04Z" level=debug msg="syncing pod disruption budgets" cluster-name=pg-pgspilotest3/pg-pgspilotest3 pkg=cluster worker=2
time="2025-04-21T06:43:04Z" level=debug msg="syncing roles" cluster-name=pg-pgspilotest3/pg-pgspilotest3 pkg=cluster worker=2
time="2025-04-21T06:43:11Z" level=debug msg="syncing Patroni config" cluster-name=pg-pgspilotest1/pg-pgspilotest1 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="making GET http request: http://192.168.14.210:8008/config" cluster-name=pg-pgspilotest1/pg-pgspilotest1 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="making GET http request: http://192.168.14.210:8008/patroni" cluster-name=pg-pgspilotest1/pg-pgspilotest1 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="syncing pod disruption budgets" cluster-name=pg-pgspilotest1/pg-pgspilotest1 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="syncing roles" cluster-name=pg-pgspilotest1/pg-pgspilotest1 pkg=cluster
time="2025-04-21T06:43:11Z" level=info msg="mark rolling update annotation for pg-pgspilotest2-1: reason pod not yet restarted due to lazy update" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="syncing Patroni config" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="making GET http request: http://192.168.38.25:8008/config" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="making GET http request: http://192.168.14.243:8008/config" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="making GET http request: http://192.168.38.25:8008/patroni" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="making GET http request: http://192.168.14.243:8008/patroni" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=info msg="performing rolling update" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=info msg="there are 2 pods in the cluster to recreate" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster
time="2025-04-21T06:43:11Z" level=debug msg="subscribing to pod "pg-pgspilotest2/pg-pgspilotest2-0"" cluster-name=pg-pgspilotest2/pg-pgspilotest2 pkg=cluster