This repository was archived by the owner on May 28, 2021. It is now read-only.

mysql container in server pod crashes intermittently after deployment. #259

Open
@d0x2f

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT

Versions

MySQL Operator Version:
helm chart master (c98210b)
Values.image.tag 0.3.0

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:08:12Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.5-gke.5", GitCommit:"2c44750044d8aeeb6b51386ddb9c274ff0beb50b", GitTreeState:"clean", BuildDate:"2019-02-01T23:53:25Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: GKE

What happened?

When creating a cluster, one of the pods in the resulting StatefulSet often enters a crash loop with the following logs:

error-log.txt

What you expected to happen?

All pods to start successfully

How to reproduce it (as minimally and precisely as possible)?

Here's my cluster.yaml:

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: mysql-config
data:
  my.cnf: |-
    [mysqld]
    default_authentication_plugin=mysql_native_password
---
apiVersion: mysql.oracle.com/v1alpha1
kind: Cluster
metadata:
  name: alchemy-database
spec:
  members: 3
  version: 8.0.12
  config:
    name: mysql-config
  volumeClaimTemplate:
    metadata:
      name: data
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: alchemy-database-router
  labels:
    app: alchemy-database-router
spec:
  ports:
    - name: read-write
      port: 6446
      targetPort: 6446
      protocol: TCP
    - name: read-only
      port: 6447
      targetPort: 6447
      protocol: TCP
  selector:
    app: alchemy-database-router
  type: ClusterIP
  clusterIP: None
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: alchemy-database-router
  labels:
    app: alchemy-database-router
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: alchemy-database-router
    spec:
      containers:
      - name: mysqlrouter
        image: mysql/mysql-router:8.0.12
        env:
        - name: MYSQL_PASSWORD
          valueFrom:
            secretKeyRef:
              name: alchemy-database-root-password
              key: password
        - name: MYSQL_USER
          value: root
        - name: MYSQL_PORT
          value: "3306"
        - name: MYSQL_HOST
          value: alchemy-database
        - name: MYSQL_INNODB_NUM_MEMBERS
          value: "3"
        command:
        - "/bin/bash"
        - "-cx"
        - "exec /run.sh mysqlrouter"
        ports:
          - containerPort: 6446
          - containerPort: 6447

Anything else we need to know?

mysql-operator is installed in the same namespace as the resources above, "alchemy".

This YAML is based on some of the examples provided in this repo. However, I've changed the access mode on the volume claims to ReadWriteOnce because ReadWriteMany isn't supported on GKE out of the box. Perhaps ReadWriteMany is required for mysql-operator?

By following the link at the end of the crash log I found the line:

The preceding means that normally you should not get corrupted tables unless one of the following happens:

  • Some external program is manipulating data files or index files at the same time as mysqld without locking the table properly.

Also, once a pod crashes, it continues to crash every time it's restarted, while the others keep running with no issues.

All of this makes me think it might have something to do with the access mode, but I was under the impression that each pod mounts its own PV, so ReadWriteOnce should be sufficient.
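
To illustrate the reasoning above: a StatefulSet creates one PersistentVolumeClaim per pod from each volumeClaimTemplate, named `<template name>-<pod name>`, where the pod name is `<statefulset name>-<ordinal>`. The sketch below is a minimal illustration, assuming the operator creates a StatefulSet named after the Cluster resource (`alchemy-database`), which is not confirmed here. `expected_pvc_names` is a hypothetical helper, not part of mysql-operator or any Kubernetes client.

```python
from typing import List


def expected_pvc_names(template: str, statefulset: str, replicas: int) -> List[str]:
    """Return the PVC names a StatefulSet derives from one volumeClaimTemplate.

    Kubernetes names each claim <template>-<statefulset>-<ordinal>,
    with ordinals starting at 0.
    """
    return [f"{template}-{statefulset}-{ordinal}" for ordinal in range(replicas)]


# Assumption: the StatefulSet shares the Cluster's name, "alchemy-database".
names = expected_pvc_names("data", "alchemy-database", 3)

# Every pod gets a distinct claim, so no two pods should share a volume.
assert len(set(names)) == len(names)

for name in names:
    print(name)
```

If the claims listed by `kubectl get pvc -n alchemy` match these names one-to-one, each pod has its own volume, which would suggest the ReadWriteOnce access mode is not what causes two mysqld processes to touch the same data files.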
