-
Notifications
You must be signed in to change notification settings - Fork 887
Override globs #1189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Override globs #1189
Conversation
This lets us say things like '*.*_ksuid' should be of type 'github.com/segmentio/ksuid.KSUID', instead of having to list every single column in every single table. We can even say '*.*.*_ksuid' for every column, in every table, in every schema. Now, ideally we would do a type replace on 'binary(20)', however that gets turned into a char for at least the mysql code, which means that we never get to see 'binary(20)', and so can't do a replacement on it. This makes use of the github.com/gobwas/glob package for the globbing, and it pretty much Just Works the way you expect it to.
This doesn't get used, and so we can get rid of it instead of adapting it for globs.
Fairly straight forward, same changes as for the other code generators.
Simple merge conflicts.
This modifies the existing overrides_go_types test cases to include tests for override globs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding tests and implementing this in a backwards-compatible way. A posted a few comments in the PR, but there's a fundamental issue with taking this approach. It has to do with the bug found in #957. In PostgreSQL (and I'm sure other databases), it's legal to have a *
character in a table or column name.
kyle=# create table foo ("bar*foo" int);
CREATE TABLE
kyle=# \d+ foo;
Table "public.foo"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------+---------+-----------+----------+---------+---------+--------------+-------------
bar*foo | integer | | | | plain | |
kyle=# create table "foo*bar" (bar int);
CREATE TABLE
kyle=# \d "foo*bar";
Table "public.foo*bar"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
bar | integer | | |
That means we need to use a different syntax. I'm not sure what that syntax is, but we should figure it out before we do any more working on the code side of things.
(I put the reply in the wrong place, moving it here.) On the syntax question, I think we can solve this with documentation, and I'm not sure that we can reasonably solve it otherwise. Specifically, \* will match a literal * instead of acting as a wildcard. (Assuming that I did my escapes properly for the markdown.) We could, admittedly, pick a different character with other special meaning to SQL that would generally prevent it from being a valid schema, table, or column character, but given my surprise by #957, I'd have to do some digging to see if something like _ was valid SQL if properly escaped. |
@zelch That's a good point about escaping the wildcard. I doubt there are many people out there using tables or columns with Looking over https://github.com/gobwas/glob, I'm actually less inclined to merge this pull request as-is. The syntax and patterns supported by this package are very powerful. I personally don't want sqlc to now have to support this many types of patterns from now on. We'll need to find another package that implements a much smaller set of patterns with the |
What are your thoughts on https://github.com/zyedidia/glob as an alternate library? It's still supporting the full 'standard' glob patterns mind you, but it does seem to get escapes right. |
No more glob.Glob in the FQN. We're isolating the matching in a different place.
This supports the following: * matches anything zero or more times. ? matches a single character. \* matches a literal *. \? matches a literal ?. Invalid escapes will cause an error.
So the Override struct no longer uses a FQN, and instead has split out *Match items for ColumnName, TableCatalog, TableSchema, and TableRel. There is a new *Override.Matches function that indicates if a given *ast.TableName (with a default schema argument) matches the override. And the code that used to use sameTableName in the python and golang code generators now uses the Matches function.
We're doing our own matching thing now.
Alright, so I had a little time, so I rewrote the glob support. The tests are unchanged and still function. With that said, there are some improvements, we no longer rely on an outside library for globbing, I used https://github.com/zyedidia/glob for inspiration, but the matching code is now in internal/config/match.go, and it supports the following:
Further, all of the matching types are isolated in Override, with no more changes to the FQN type. And there is now a function:
I think this addresses all of the concerns, and it's definitely a lot cleaner of a change compared to how it was previously. |
This looks great. I may change a few small things after merge, but this is ready to go. Thanks for being willing to iterate on the code. |
This is a revisit of the now closed (for perfectly good reasons) pull request #862 .
Now with test coverage.
(I wasn't expecting to get a chance to address this anytime soon, but since I spent most of they day buried in sqlc and pingcap/parser code, spending the time was much easier to justify.)
So, I'm going to start with the use case.
Our MariaDB database has a lot of KSUIDs, enough that I would be better off writing a tool to automate building the override statements than doing it by hand.
It would be nice if we could just override on type, replace 'binary(20)' with 'github.com/segmentio/ksuid.KSUID' and call it a day.
Sadly, at least the mysql parser turns 'binary(20)' into 'char' well before the override code gets it. In fact, it does this well before any of the sqlc code gets it.
That leaves doing it by column name, but as mentioned above... There are quite a lot of them.
On the other hand, our naming convention is at least vaguely straight forward. The columns all end with _ksuid, so we can automatically detect them.
But why write rules to do this, and have a huge override table, when we can just add globbing support to sqlc?
And so I did that instead. :)
This passes all of the tests, and is reasonably straight forward.