Skip to content

Make CodeGeneration a lot less string-based #1672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 25, 2023

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented May 16, 2023

Because of its Python legacy the CodeGeneration tool has been heavily string-based. This PR fixes that at least to some degree.

The main motivating factors here were:

  • Instead of using strings to specify the node type of a child node, define a SyntaxNodeKind enum that contains a case for each syntax node. This way you can be sure that you’re not referring to a non-existent syntax node (like we did for e.g. InOutToken, which we forgot to change to .token("inout"))
  • Refactor Node so that it contains two initializers: One for collections and one for layout nodes instead of having one initializer that has a bunch of optional and defaulted arguments.
  • Change a bunch of properties from returning a String to returning a TypeSyntax or TokenSyntax so we don‘t need to use the raw: interpolation style for them.

In general, the API of Node and SyntaxNodeKind is what I’m happy with now. The other files can still do with some cleanup.

All the changes resulted in nearly no functionality changes of the generated code, so I think this PR doesn’t need a super thorough review, just skimming over it should be sufficient. It’s really repetitive after all.

@ahoppen ahoppen requested review from bnbarham and kimdv May 16, 2023 18:07
Copy link
Contributor

@kimdv kimdv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So nice! WUHU! 😍

Have been thinking a lot on doing this, but you were faster.

@@ -123,19 +118,26 @@ public class Child {

/// Whether this child has syntax kind `UnexpectedNodes`.
public var isUnexpectedNodes: Bool {
syntaxKind == "UnexpectedNodes"
switch kind {
case .collection(kind: .unexpectedNodes, collectionElementName: _):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to just omit collectionElementName

nameForDiagnostics: nil,
description: "A CodeBlockItem is any Syntax node that appears on its own line inside a CodeBlock.",
kind: "Syntax",
documentation: "A CodeBlockItem is any Syntax node that appears on its own line inside a CodeBlock.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not part of this, but we could add ticks around CodeBlockItem and CodeBlock (and properly also many other places) to add doc linking

/// `AccessorList` represents a collection of `AccessorDeclSyntax`
/// `AccessorListSyntax` represents a collection of `AccessorDeclSyntax`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs don't seem particularly useful to me. I'd expect them on the type itself rather than the extension (and even there... fairly dubious about their value).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won’t argue with you about that. I hope that we’ll write proper documentation soon and then these will just automatically go away.

Because of its Python legacy the CodeGeneration tool has been heavily string-based. This PR fixes that at least to some degree.

The main motivating factors here were:
- Instead of using strings to specify the node type of a child node, define a `SyntaxNodeKind` enum that contains a case for each syntax node. This way you can be sure that you’re not referring to a non-existent syntax node (like we did for e.g. `InOutToken`, which we forgot to change to `.token("inout")`)
- Refactor `Node` so that it contains two initializers: One for collections and one for layout nodes instead of having one initializer that has a bunch of optional and defaulted arguments.
- Change a bunch of properties from returning a `String` to returning a `TypeSyntax` or `TokenSyntax` so we don‘t need to use the `raw:` interpolation style for them.

In general, the API of `Node` and `SyntaxNodeKind` is what I’m happy with now. The other files can still do with some cleanup.

All the changes resulted in nearly no functionaly changes of the generated code.
@ahoppen ahoppen force-pushed the ahoppen/code-gen-not-string-based branch from e708871 to cf6ff40 Compare May 25, 2023 03:23
@ahoppen
Copy link
Member Author

ahoppen commented May 25, 2023

@swift-ci Please test

@ahoppen ahoppen enabled auto-merge May 25, 2023 03:23
@ahoppen ahoppen merged commit 4c0ad1b into swiftlang:main May 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants