Skip to content
This repository was archived by the owner on May 28, 2021. It is now read-only.

Workaround for sanitizing JSON coming from MySQL Shell #132

Merged
merged 1 commit into from
Jun 26, 2018

Conversation

gianlucaborello
Copy link

@gianlucaborello gianlucaborello commented Jun 12, 2018

Hi

I am testing the operator ability to effectively recover a cluster in case of different types of failures, and I found a corner case that doesn't seem to be handled.

To replicate:

  • Start a simple cluster with single primary, 3 replicas total
  • Kill 2 of the 3 replicas without deleting the underlying data for the pods. This can be achieved in several ways, in my test I started by using emptydir volumes, so I simply killed the two mysql instances with kill -9 to preserve the data
  • The previous step causes the cluster to lose quorum, and the situation never recovers, even if the other 2 mysql instances come up nearly immediately

The first error I get in the log is:

E0611 20:27:07.737705       1 cluster_manager.go:114] Failed to get the cluster status: invalid character '\'' in string escape code
decoding cluster status output: "{\"clusterName\": \"MySQLCluster\", \"defaultReplicaSet\": {\"name\": \"default\", \"primary\": \"mysql-0.mysql:3306\", \"ssl\": \"REQUIRED\", \"status\": \"NO_QUORUM\", \"statusText\": \"Cluster has no quorum as visible from \\'mysql-1.mysql:3306\\' and cannot process write transactions. 2 members are not active\", \"topology\": {\"mysql-0.mysql:3306\": {\"address\": \"mysql-0.mysql:3306\", \"mode\": \"R/W\", \"readReplicas\": {}, \"role\": \"HA\", \"status\": \"UNREACHABLE\"}, \"mysql-1.mysql:3306\": {\"address\": \"mysql-1.mysql:3306\", \"mode\": \"R/O\", \"readReplicas\": {}, \"role\": \"HA\", \"status\": \"ONLINE\"}, \"mysql-2.mysql:3306\": {\"address\": \"mysql-2.mysql:3306\", \"mode\": \"R/O\", \"readReplicas\": {}, \"role\": \"HA\", \"status\": \"UNREACHABLE\"}}}, \"groupInformationSourceMember\": \"mysql://root@mysql-1.mysql:3306\"}\n"

It seems like MySQL shell is generating non-valid JSON since it escapes the single quotes in Cluster has no quorum as visible from '127.0.0.1:3306', which is not an escapable character in JSON (https://www.json.org/, right table).

The issue can be replicated also manually by running mysqlsh --py:

> dba.get_cluster().status()
{
    "clusterName": "MySQLCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql-0.mysql:3306",
        "ssl": "REQUIRED",
        "status": "NO_QUORUM",
        "statusText": "Cluster has no quorum as visible from '127.0.0.1:3306' and cannot process write transactions. 2 members are not active",
        "topology": {
            "mysql-0.mysql:3306": {

Notice how printing the object correctly renders a valid json, however when using print like in the code (which is useful for other reasons, such as stripping \n that show up during the pretty rendering), it gets mangled and the string ends up containing \':

> s = dba.get_cluster().status()
> print s
{"clusterName": "MySQLCluster", "defaultReplicaSet": {"name": "default", "primary": "mysql-0.mysql:3306", "ssl": "REQUIRED", "status": "NO_QUORUM", "statusText": "Cluster has no quorum as visible from \'127.0.0.1:3306\' and cannot process write transactions. 2 members are not active", "topology": {"mysql-0.mysql:3306": {"address": "mysql-0.mysql:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "UNREACHABLE"}, "mysql-1.mysql:3306": {"address": "mysql-1.mysql:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "UNREACHABLE"}, "mysql-2.mysql:3306": {"address": "mysql-2.mysql:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE"}}}, "groupInformationSourceMember": "mysql://root@127.0.0.1:3306"}
>
> import json
> json.loads(str(s))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 202 (char 201)

Which is the same error we get on the agent side. Notice that the standard dict Python implementation doesn't have this behavior, it seems to be specific to the types used inside MySQL shell:

> type(s)
<type 'shell.Dict'>

Now, I think fully understanding the issue is beyond my paygrade, but digging into the MySQL Shell code I can find the definition of the shell.Dict custom type:

https://github.com/mysql/mysql-shell/blob/8.0.11/mysqlshdk/scripting/python_map_wrapper.cc#L364

Inside this custom type, the defined __str__ function, dict_printable, seems to indeed always escape the single quote, leading to a non valid JSON.

https://github.com/mysql/mysql-shell/blob/8.0.11/mysqlshdk/scripting/types.cc#L1234

I'm not sure what the right solution is (unfortunately shell.Dict doesn't seem serializable so we can't use json.dumps on it, which would definitely do the right encoding), but in the meantime I'm just submitting a dirty RFC patch that simply escapes such character.

At the moment I'm hesitant to sign the Oracle CLA, I make this commit under the Apache License 2.0, feel free to use it as is, or also just report the changes in your own commit, I don't mind. Please let me know if it's going to be a problem.

Thanks

@prydie
Copy link

prydie commented Jun 12, 2018

Hi @gianlucaborello,

Thanks for the contribution and the awesome analysis!

I think your solution is probably the best we can do at this point. I have, however, directed the MySQL shell team here so hopefully, we can get it fixed at source and remove the workaround down the line.

I'd really encourage you to sign the CLA if at all possible as we really want to build a community around the MySQL Operator and a big part of that is having people's contributions attributed and valued. I certainly appreciate, however, that we have a long way to go to earn developer trust and the process is rather arduous so do let me know if you don't want to sign and I'll open up a follow-on PR to incorporate your work.

@prydie prydie added the oracle-cla: no Contributor has not yet signed the Oracle Contributor Licence Agreement label Jun 13, 2018
include invalid \' sequences.

Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Copy link

@prydie prydie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once CLA processed 👍

@prydie
Copy link

prydie commented Jun 18, 2018

@prydie
Copy link

prydie commented Jun 18, 2018

I've opened https://bugs.mysql.com/bug.php?id=91304 and the mysqlsh team are triaging.

@gianlucaborello
Copy link
Author

FYI, signed and sent the CLA today.

@owainlewis owainlewis added oracle-cla: yes Contributor has signed the Oracle Contributor Licence Agreement and removed oracle-cla: no Contributor has not yet signed the Oracle Contributor Licence Agreement labels Jun 26, 2018
@owainlewis owainlewis self-assigned this Jun 26, 2018
@owainlewis owainlewis self-requested a review June 26, 2018 10:07
@owainlewis owainlewis merged commit 5e6eb90 into oracle:master Jun 26, 2018
@gianlucaborello gianlucaborello deleted the fix-invalid-json branch June 26, 2018 15:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
oracle-cla: yes Contributor has signed the Oracle Contributor Licence Agreement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants