Support CONVERT expressions #1048

Merged
merged 1 commit into from
Nov 20, 2023

Contributor

@lovasoa lovasoa commented Nov 18, 2023

fixes #1047

adds support for the following CONVERT syntaxes:

  • CONVERT('héhé' USING utf8mb4) (MySQL, Postgres)
  • CONVERT('héhé', CHAR CHARACTER SET utf8mb4) (MySQL)
  • CONVERT(DECIMAL(10, 5), 42) (MSSQL) - the type comes first
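
To make the three forms concrete, here is a minimal sketch of how one AST node could cover all of them and print each back as SQL. It is an illustration only, not the code merged in this PR: the struct name `ConvertExpr`, the `target_before_value` flag, and the use of plain strings for the expression, type, and charset are all assumptions made to keep the example self-contained.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for a CONVERT AST node.
/// Real parsers would use typed expression / data type nodes instead of strings.
#[derive(Debug, Clone, PartialEq)]
struct ConvertExpr {
    /// The expression being converted
    expr: String,
    /// The target data type, if any
    data_type: Option<String>,
    /// The target character encoding, if any
    charset: Option<String>,
    /// True for the MSSQL argument order `CONVERT(type, value)` (assumed flag name)
    target_before_value: bool,
}

impl fmt::Display for ConvertExpr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "CONVERT(")?;
        match (&self.data_type, &self.charset) {
            // MSSQL: CONVERT(type, value)
            (Some(ty), _) if self.target_before_value => write!(f, "{}, {}", ty, self.expr)?,
            // MySQL: CONVERT(value, type CHARACTER SET charset)
            (Some(ty), Some(cs)) => write!(f, "{}, {} CHARACTER SET {}", self.expr, ty, cs)?,
            // CONVERT(value, type)
            (Some(ty), None) => write!(f, "{}, {}", self.expr, ty)?,
            // MySQL, Postgres: CONVERT(value USING charset)
            (None, Some(cs)) => write!(f, "{} USING {}", self.expr, cs)?,
            // Degenerate case: no target at all
            (None, None) => write!(f, "{}", self.expr)?,
        }
        write!(f, ")")
    }
}
```

Calling `to_string()` on a value with `charset: Some("utf8mb4")` and no data type would yield the `USING` form; setting `target_before_value` selects the type-first order.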

@coveralls

coveralls commented Nov 18, 2023

Pull Request Test Coverage Report for Build 6915102482

  • 59 of 69 (85.51%) changed or added relevant lines in 7 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.007%) to 87.708%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/dialect/mod.rs | 1 | 2 | 50.0% |
| src/dialect/mssql.rs | 1 | 2 | 50.0% |
| src/dialect/redshift.rs | 0 | 2 | 0.0% |
| src/ast/mod.rs | 14 | 17 | 82.35% |
| src/parser/mod.rs | 33 | 36 | 91.67% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/ast/mod.rs | 1 | 78.94% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 6914235970: | -0.007% |
| Covered Lines: | 17845 |
| Relevant Lines: | 20346 |

💛 - Coveralls

@lovasoa lovasoa force-pushed the convert-expressions branch from b222845 to e5399b3 Compare November 18, 2023 16:36
@tobyhede
Contributor

tobyhede commented Nov 19, 2023

The "simple" function syntax already works (eg CONVERT(DECIMAL(10, 5), 42))

You could possibly make the code simpler by keeping this behaviour and only running the more complex CONVERT parsing when that syntax is detected.

Adding something like the following in the parser.

if self.parse_keyword(Keyword::USING) {
    // build the CONVERT expression
} else {
    // default to the existing parse_function
}

This approach would remove the need to check the argument order, and the CONVERT struct would not need so many Option fields, since it would only ever be used for the "complex" CONVERT statements.

@lovasoa
Contributor Author

lovasoa commented Nov 20, 2023

@tobyhede, we cannot do an `if self.parse_keyword(Keyword::USING) {` check after having already parsed the first argument as a data type. In the MySQL syntax, the first argument is an expression, not a data type.
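
The ordering constraint can be sketched with a toy parser over pre-split tokens. This is an assumption-laden illustration, not the parser code from this PR: `ConvertArgs`, `parse_convert_args`, `expect`, and the `type_first` flag are hypothetical names, and arguments are single tokens rather than full expressions. The point it shows is that the decision about what the first argument *is* has to be made before consuming it; `USING` can only be looked for once the first argument has been read as an expression.

```rust
/// Hypothetical result of parsing the arguments of a CONVERT call.
#[derive(Debug, PartialEq)]
enum ConvertArgs {
    /// CONVERT(expr USING charset)
    Using { expr: String, charset: String },
    /// CONVERT(expr, type), or CONVERT(type, expr) when the type comes first
    Typed { expr: String, data_type: String },
}

/// Checks that the next token is exactly `want`.
fn expect(tok: Option<&str>, want: &str) -> Result<(), String> {
    match tok {
        Some(t) if t == want => Ok(()),
        other => Err(format!("expected {:?}, found {:?}", want, other)),
    }
}

/// Parses the tokens found between the parentheses of a CONVERT call.
/// `type_first` stands in for a per-dialect flag (assumed name).
fn parse_convert_args(tokens: &[&str], type_first: bool) -> Result<ConvertArgs, String> {
    let mut it = tokens.iter().copied();
    let first = it.next().ok_or("expected an argument")?;
    if type_first {
        // Type-first order: the first argument is a data type,
        // so a USING keyword can never follow it.
        expect(it.next(), ",")?;
        let expr = it.next().ok_or("expected an expression")?;
        return Ok(ConvertArgs::Typed {
            expr: expr.to_string(),
            data_type: first.to_string(),
        });
    }
    // Value-first order: the first argument is an expression,
    // and only now does it make sense to look for USING.
    match it.next() {
        Some(kw) if kw.eq_ignore_ascii_case("USING") => {
            let charset = it.next().ok_or("expected a charset")?;
            Ok(ConvertArgs::Using {
                expr: first.to_string(),
                charset: charset.to_string(),
            })
        }
        Some(",") => {
            let data_type = it.next().ok_or("expected a data type")?;
            Ok(ConvertArgs::Typed {
                expr: first.to_string(),
                data_type: data_type.to_string(),
            })
        }
        other => Err(format!("unexpected token: {:?}", other)),
    }
}
```

For example, `parse_convert_args(&["'héhé'", "USING", "utf8mb4"], false)` takes the `USING` branch, while the same tokens with `type_first = true` are rejected because a comma is required after the type.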

@lovasoa
Contributor Author

lovasoa commented Nov 20, 2023

> The "simple" function syntax already works (eg CONVERT(DECIMAL(10, 5), 42))

Does it? I think CONVERT(DECIMAL(10, 5), 42) is parsed incorrectly today: DECIMAL(10, 5) would be parsed as an expression (a function call) instead of a data type.

@alamb
Contributor

alamb commented Nov 20, 2023

> Does it? I think CONVERT(DECIMAL(10, 5), 42) is parsed incorrectly today. DECIMAL(10, 5) would be parsed as an expression (a function call) instead of a data type.

I agree this is what it seems to do:

$ echo "select CONVERT(DECIMAL(10, 5), 42)" > /tmp/foo.sql
$ cargo run --example cli -- /tmp/foo.sql
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/examples/cli /tmp/foo.sql`
Parsing from file '/tmp/foo.sql' using GenericDialect
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] Parsing sql 'select CONVERT(DECIMAL(10, 5), 42)
'...
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Value(Number("10", false))
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: Comma, location: Location { line: 1, column: 26 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: , 1: 5 2: )
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Value(Number("5", false))
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: RParen, location: Location { line: 1, column: 29 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: ) 1: , 2: 42
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Function(Function { name: ObjectName([Ident { value: "DECIMAL", quote_style: None }]), args: [Unnamed(Expr(Value(Number("10", false)))), Unnamed(Expr(Value(Number("5", false))))], filter: None, null_treatment: None, over: None, distinct: false, special: false, order_by: [] })
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: Comma, location: Location { line: 1, column: 30 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: , 1: 42 2: )
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Value(Number("42", false))
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: RParen, location: Location { line: 1, column: 34 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: ) 1: EOF 2: EOF
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Function(Function { name: ObjectName([Ident { value: "CONVERT", quote_style: None }]), args: [Unnamed(Expr(Function(Function { name: ObjectName([Ident { value: "DECIMAL", quote_style: None }]), args: [Unnamed(Expr(Value(Number("10", false)))), Unnamed(Expr(Value(Number("5", false))))], filter: None, null_treatment: None, over: None, distinct: false, special: false, order_by: [] }))), Unnamed(Expr(Value(Number("42", false))))], filter: None, null_treatment: None, over: None, distinct: false, special: false, order_by: [] })
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: EOF, location: Location { line: 0, column: 0 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: EOF 1: EOF 2: EOF
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
Round-trip:
'SELECT CONVERT(DECIMAL(10, 5), 42)'
Parse results:
[
    Query(
        Query {
            with: None,
            body: Select(
                Select {
                    distinct: None,
                    top: None,
                    projection: [
                        UnnamedExpr(
                            Function(
                                Function {
                                    name: ObjectName(
                                        [
                                            Ident {
                                                value: "CONVERT",
                                                quote_style: None,
                                            },
                                        ],
                                    ),
                                    args: [
                                        Unnamed(
                                            Expr(
                                                Function(
                                                    Function {
                                                        name: ObjectName(
                                                            [
                                                                Ident {
                                                                    value: "DECIMAL",
                                                                    quote_style: None,
                                                                },
                                                            ],
                                                        ),
                                                        args: [
                                                            Unnamed(
                                                                Expr(
                                                                    Value(
                                                                        Number(
                                                                            "10",
                                                                            false,
                                                                        ),
                                                                    ),
                                                                ),
                                                            ),
                                                            Unnamed(
                                                                Expr(
                                                                    Value(
                                                                        Number(
                                                                            "5",
                                                                            false,
                                                                        ),
                                                                    ),
                                                                ),
                                                            ),
                                                        ],
                                                        filter: None,
                                                        null_treatment: None,
                                                        over: None,
                                                        distinct: false,
                                                        special: false,
                                                        order_by: [],
                                                    },
                                                ),
                                            ),
                                        ),
                                        Unnamed(
                                            Expr(
                                                Value(
                                                    Number(
                                                        "42",
                                                        false,
                                                    ),
                                                ),
                                            ),
                                        ),
                                    ],
                                    filter: None,
                                    null_treatment: None,
                                    over: None,
                                    distinct: false,
                                    special: false,
                                    order_by: [],
                                },
                            ),
                        ),
                    ],
                    into: None,
                    from: [],
                    lateral_views: [],
                    selection: None,
                    group_by: Expressions(
                        [],
                    ),
                    cluster_by: [],
                    distribute_by: [],
                    sort_by: [],
                    having: None,
                    named_window: [],
                    qualify: None,
                },
            ),
            order_by: [],
            limit: None,
            limit_by: [],
            offset: None,
            fetch: None,
            locks: [],
            for_clause: None,
        },
    ),
]

Contributor

@alamb alamb left a comment

Looks good to me. Thank you for the contribution @lovasoa

/// The target data type
data_type: Option<DataType>,
/// The target character encoding
charset: Option<ObjectName>,

FWIW Postgres calls this "conversion name" but that appears to be an ObjectName so 👍

https://www.postgresql.org/docs/8.2/functions-string.html

@@ -35,6 +35,12 @@ impl Dialect for MsSqlDialect {
|| ch == '_'
}

/// SQL Server has `CONVERT(type, value)` instead of `CONVERT(value, type)`

🤯
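
The dialect-level switch described by that doc comment can be sketched as a trait method with a default. This is a rough illustration under stated assumptions: the trait is trimmed down to a single method, the method name `convert_type_before_value` and the helper `split_convert_args` are chosen here for the example, and the real trait in the crate has many more members.

```rust
/// Hypothetical, trimmed-down dialect trait; only the CONVERT flag is modeled.
trait Dialect {
    /// Whether CONVERT takes its target type before the value.
    /// Defaults to the `CONVERT(value, type)` order.
    fn convert_type_before_value(&self) -> bool {
        false
    }
}

struct GenericDialect;
struct MsSqlDialect;

// The generic dialect keeps the default argument order.
impl Dialect for GenericDialect {}

impl Dialect for MsSqlDialect {
    /// SQL Server has `CONVERT(type, value)` instead of `CONVERT(value, type)`
    fn convert_type_before_value(&self) -> bool {
        true
    }
}

/// Returns the two CONVERT arguments as (data type, value) for the given dialect.
fn split_convert_args<'a>(
    dialect: &dyn Dialect,
    first: &'a str,
    second: &'a str,
) -> (&'a str, &'a str) {
    if dialect.convert_type_before_value() {
        (first, second)
    } else {
        (second, first)
    }
}
```

With this shape, a parser only needs to ask the dialect once, before reading the first argument, and dialects that do not override the method keep the value-first order.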

@alamb alamb changed the title add support for CONVERT expressions Support CONVERT expressions Nov 20, 2023
@alamb alamb merged commit c905ee0 into apache:main Nov 20, 2023
@lovasoa lovasoa deleted the convert-expressions branch November 20, 2023 19:57
Successfully merging this pull request may close these issues.

support mysql's CONVERT(string USING charset)
4 participants