On network loss for any one of the StatefulSets, the operator throws an error. #2620

Open
@oumkale

Description

I was injecting network-loss chaos on the master database.
After the network loss, the operator is stuck at the error below and is unable to create a new cluster or delete the current one.

  • Not able to scale replicas

Environment

Please provide the following details:

  • Platform: (Kubernetes, EKS)
  • Platform Version: (e.g. 1.20.3, 4.7.0)
  • PGO Image Tag: (e.g. ubi8-5.0.0-0)
  • Postgres Version (e.g. 13)
  • Storage: (e.g. hostpath, nfs, or the name of your storage class)

Operator image details (I also tried a couple of other tags):

images:
- name: postgres-operator
  newName: registry.developers.crunchydata.com/crunchydata/postgres-operator
  newTag: ubi8-5.0.0-0

EXPECTED

  1. The operator should not stay stuck at this error; it needs to return to a healthy state.

ACTUAL

  1. The operator is stuck at the error below and is unable to create a new cluster or delete the current one.

Logs

raise KubernetesError('Kubernetes API is not responding properly')\npatroni.dcs.kubernetes.KubernetesError: 'Kubernetes API is not responding properly'\n" stdout= version=5.0.1-0
time="2021-08-19T11:21:58Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:275" func="postgrescluster.(*Reconciler).Reconcile" name=keycloakdb namespace=postgres reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.0.1-0
time="2021-08-19T11:21:58Z" level=error msg="Reconciler error" error="command terminated with exit code 1" file="internal/controller/postgrescluster/patroni.go:147" func="postgrescluster.(*Reconciler).reconcilePatroniDynamicConfiguration" name=keycloakdb namespace=postgres reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.0.1-0
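
For anyone triaging similar output, here is a minimal sketch of how the `level=error` entries could be pulled out of these key=value log lines. It is not part of the operator: the helper names `parse_line` and `error_entries` are hypothetical, the sample lines are shortened copies of the two entries above, and the regular expression assumes values are either double-quoted or free of spaces.

```python
import re

# One key=value pair; the value is either a double-quoted string or a bare token.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Return the key=value fields of one log line as a dict.

    Words without an '=' (for example the bare "reconciler" tokens) are skipped.
    """
    fields = {}
    for key, value in PAIR.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def error_entries(lines):
    """Keep only the parsed entries whose level is 'error'."""
    return [f for f in map(parse_line, lines) if f.get("level") == "error"]


# Shortened copies of the operator log lines quoted in this report.
sample = [
    'time="2021-08-19T11:21:58Z" level=debug msg="reconciled cluster" '
    'name=keycloakdb namespace=postgres',
    'time="2021-08-19T11:21:58Z" level=error msg="Reconciler error" '
    'error="command terminated with exit code 1" '
    'file="internal/controller/postgrescluster/patroni.go:147" '
    'name=keycloakdb namespace=postgres',
]

for entry in error_entries(sample):
    print(entry["file"], "-", entry["error"])
```

On the sample, only the `reconcilePatroniDynamicConfiguration` failure from `patroni.go:147` is reported; the debug entry is filtered out.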
