Random crashing

**Describe the bug**
I have 4 PGO clusters created and running. After some time they randomly crash and fail to come back. I've tried restarting the clusters, and even killing all the pods in the pgo namespace, which just results in a complaint about multi-attached PVCs. So far the only way to fix this is to delete the clusters and restore from backups.

Storage for both the primary and the replicas is rook-ceph-block, and pgBackRest uses S3. I can't for the life of me figure out what is causing the crash, as the logs themselves do not show anything that jumps out at me. Please see below:

```
2020-12-20 02:21:09,854 INFO: no action.  i am the leader with the lock
2020-12-20 02:21:19,847 INFO: Lock owner: keycloak-8445bc9877-997sw; I am keycloak-8445bc9877-997sw
2020-12-20 02:21:19,916 INFO: no action.  i am the leader with the lock
2020-12-20 02:21:29,847 INFO: Lock owner: keycloak-8445bc9877-997sw; I am keycloak-8445bc9877-997sw
2020-12-20 02:21:29,872 INFO: no action.  i am the leader with the lock
2020-12-20 02:21:39,847 WARNING: Postgresql is not running.
2020-12-20 02:21:39,847 INFO: Lock owner: keycloak-8445bc9877-997sw; I am keycloak-8445bc9877-997sw
2020-12-20 02:21:39,907 INFO: Reaped pid=45567, exit status=0
2020-12-20 02:21:39,908 INFO: pg_controldata:
```

**Please tell us about your environment:**

* Operating System: Ubuntu 20.04.1 LTS
* Where is this running (Local, Cloud Provider): Local
* Storage being used (NFS, Hostpath, Gluster, etc): S3 and rook-ceph-block
* Container Image Tag: centos7-12.5-4.5.1
* PostgreSQL Version: 12.5
* Platform (Docker, Kubernetes, OpenShift): Kubernetes 1.20.1
* Platform Version: 4.5.1


Random crashing #2138
