Skip to content

Implement StringLiteralExprSyntax/contentValue. #1508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 8, 2023

Conversation

NikolaiRuhe
Copy link
Contributor

SwiftSyntax parses string literals as StringLiteralExprSyntax. When the literal is static (does not contain interpolation segments) it can be useful to access the actual string value from the syntax tree.

The source-accurate representation in StringLiteralExprSyntax differs from run-time value:

  • Escaped character sequences (\n, \u{1F600}) are not evaluated.
  • Some strings are split into segments (i.e. multiline strings).

StringLiteralExprSyntax/contentValue performs the necessary unescaping and concatenation of segments.

@NikolaiRuhe NikolaiRuhe requested a review from ahoppen as a code owner April 9, 2023 16:59
///
/// Returns nil if the literal contains interpolation segments.

public var contentValue: String? {
Copy link
Contributor

@CodaFi CodaFi Apr 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be very very careful about trafficking in Strings derived from the syntax tree. The documentation should be very clear about the limitations of this API with respect to the caveats in its implementation and the source fidelity concerns laid out here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that contentValue sounds a little like the actual string content of the literal. What do you think about unescapedValue? I am not terribly happy with the name and am open to better suggestions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had trouble coming up with a proper name.

I considered a couple of candidates like stringValue, runtimeValue, content, etc. The final decision for contentValue came from some existing code in ConvenienceInitializers.swift which basically does the opposite: Building a StringLiteralExprSyntax from an unescaped String value. This initializer uses the label content: for the value. Just content seemed to conflicting, though.

I'm also open for better names. I have the feeling unescapedValue only matches partly, leaving out the concatenation part, and is difficult to find. But I'll follow your advice here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main reason why I don’t like contentValue is that content is already a term that’s used in SwiftSyntax, e.g. StringLiteralExprSyntax has a contentLength property that’s completely unrelated to contentValue. And that’s pretty confusing to me.

Alternative idea: What do you think about representedLiteralValue?

@CodaFi
Copy link
Contributor

CodaFi commented Apr 9, 2023

I'm curious why this in SwiftParser and not, say, SwiftSyntaxBuilder.

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is looking great to me. A very clever use of the lever to lex the escaped string characters.

@CodaFi IIUC this lives in SwiftParser because it access the lever, which isn’t public (and shouldn’t be). We could define this as SPI in SwiftParser and then offer it as public API in SwiftSyntaxBuilder but I think it shouldn’t live in SwiftSyntaxBuilder because it’s not about building a syntax tree. So, I think SwiftParser is indeed the correct location for this property.

///
/// Returns nil if the literal contains interpolation segments.

public var contentValue: String? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that contentValue sounds a little like the actual string content of the literal. What do you think about unescapedValue? I am not terribly happy with the name and am open to better suggestions.


/// Helper class to find string literals in this source file
class StringLiteralCollector: SyntaxVisitor {
var locationConverter: SourceLocationConverter? = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only used temporarily during walk and can be freed afterwards. Not sure if this is a good reason.

@NikolaiRuhe
Copy link
Contributor Author

Thanks @CodaFi and @ahoppen for the feedback🙏. I’ll reply to it and fix code when I get back from vacation in a week.

@NikolaiRuhe
Copy link
Contributor Author

I'm curious why this in SwiftParser and not, say, SwiftSyntaxBuilder.

My understanding of SwiftSyntax' module structure is limited. But I think it could actually live in the plain SwiftSyntax module. The functionality is neither about parsing nor building but really just a derived value from the syntax node. It could potentially be useful from manually initialized nodes.

The reason I put it in SwiftParser is because I'm reusing private API from the parser module: Lexer/lexCharacterInStringLiteral.

@NikolaiRuhe
Copy link
Contributor Author

I'm not sure about the proper flow: Shall I squash the two commits? Or shall I even rebase onto current main? How does the PR proceed from here?

Thanks, again!

/// resolved.
///
/// Returns nil if the literal contains interpolation segments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superfluous newline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

///
/// Most tests are expressed by a single function call that compares the
/// run-time String value against the parsed node's `contentValue`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superfluous newline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

///
/// Returns nil if the literal contains interpolation segments.

public var contentValue: String? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main reason why I don’t like contentValue is that content is already a term that’s used in SwiftSyntax, e.g. StringLiteralExprSyntax has a contentLength property that’s completely unrelated to contentValue. And that’s pretty confusing to me.

Alternative idea: What do you think about representedLiteralValue?

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision. I think the only thing remaining is the naming of the method. But as they say: The 2 hardest problems in computer science are cache invalidation, naming and off-by-1 errors 😆

Regarding the process: There’s no need to rebase unless there are merge conflicts. Since we don’t squash commits on merge, it would be great if you could squash your commits. Once we’ve decided on the name, I’ll run CI and if that passes, I’ll merge the PR.

@NikolaiRuhe NikolaiRuhe force-pushed the string-literal-value branch from 11ec8d5 to 19c7184 Compare May 7, 2023 08:45
@NikolaiRuhe
Copy link
Contributor Author

Yes, naming indeed can be very difficult—and is at the same time so important. It's a bit like an app's icon on the app store. The first thing you see and it should be able to explain what it does in a glimpse. Nobody reads the docs 😢.

In that sense I do like representedLiteralValue. It works unchanged on IntegerLiteralExprSyntax and FloatLiteralExprSyntax (upcoming PR?).

Thanks for the help and the review!

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. Thank you. Let’s merge it.

@ahoppen
Copy link
Member

ahoppen commented May 8, 2023

@swift-ci Please test

@ahoppen ahoppen enabled auto-merge May 8, 2023 16:59
@kimdv
Copy link
Contributor

kimdv commented May 8, 2023

@swift-ci please test windows

@ahoppen ahoppen merged commit da4add8 into swiftlang:main May 8, 2023
@NikolaiRuhe NikolaiRuhe deleted the string-literal-value branch May 9, 2023 10:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants