String interpolation #1071

Closed
10 tasks done
degory opened this issue Feb 19, 2024 · 0 comments

degory commented Feb 19, 2024

Support string interpolation

TODO

Essential

  • Tokenize string interpolations including transitions between string literal fragments and interpolated expressions
  • Tokenize formats for interpolated expressions, introduced with : and terminated with }
  • Handle arbitrary nesting of interpolations
  • A newline character within an interpolation should cancel out of it immediately with an error, and also completely unwind tokenization and parsing of interpolations, returning the outermost string interpolation parse tree.
  • Allow escaping of { and } by doubling them
  • Add a string interpolation parse tree and update all visitors to accept it
  • Parse string interpolations to correct parse tree
  • Generate code to concatenate string fragments and interpolated expressions into a string result
  • Format each interpolated value according to its format string, where provided

Optional

  • Use DefaultInterpolatedStringHandler to perform the concatenation of string fragments and interpolated expressions

Details

We're not using { and } for anything, and they're already associated with string interpolation in other languages.

We could in principle use { and } as quotes, as well as for introducing interpolated expressions, however this is not very readable and so could be error prone.

I don't want to further overload `, or require a prefix to " in order to differentiate interpolated strings: once they're available they're likely to be the most common way to format output, more so than string.format or overloads on write_line for example.

So, as the expected most common use case, we want string interpolation to be low friction with a clean syntax. The rarer non-interpolation uses of { and } in strings will instead need to be escaped. To make this less painful, when it is needed, we should support escaping { and } by doubling them - this is both easier to type and easier to read than escaping with `'.

"this is a string {this is an expression} back to the string {another expression "nested string" back to the second expression} back to the original string"
"name is {name}, age is {age}, height is {height} number of limbs is {arms + legs + if head? then 1 else 0 fi}"

The syntax highlighting will need to be updated to support this. Although true recursive syntax highlighting is not possible from a TextMate grammar, we can do three or four levels of nesting, which will likely cover the vast majority of real use.

Some languages allow newlines in interpolated strings and some also disable escapes, but I'm not convinced the benefits outweigh the complexity. I'd prefer a separate syntax to introduce a multi-line string, to avoid incomplete strings causing a cascade of errors and breaking completion and other language extension features in the interpolated expressions.

To parse interpolations we'll want to split the complexity between the tokenizer and the parser. The tokenizer needs to keep track of whether it's in an interpolated string or in an expression within an interpolated string, which it can do just by tracking { and }. The tokenizer can then feed the parser a sequence of tokens: enter string interpolation, interpolated string segment, enter interpolated expression, exit interpolated expression, exit string interpolation. The parser can then take this linear sequence and recursively build the required parse tree from it.
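As a rough illustration of the tokenizer side of that split, here is a minimal Python sketch (not the actual compiler code). It assumes a stack of open contexts is enough to track { and }, that doubled braces are only recognised inside string segments, and that a newline inside any open interpolation emits an error and unwinds every level. The token kind names and the `Token` and `tokenize` names are hypothetical, format specifiers are not handled, and expression text is passed through as a raw `code` token rather than being tokenized properly.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str       # hypothetical kind names, chosen for this sketch only
    text: str = ""


def tokenize(source: str) -> list[Token]:
    tokens: list[Token] = []
    stack: list[str] = []   # open contexts, innermost last: "string" or "expr"
    buf: list[str] = []     # characters of the fragment currently being read

    def flush(kind: str) -> None:
        # Emit the buffered characters, if any, as a single token.
        if buf:
            tokens.append(Token(kind, "".join(buf)))
            buf.clear()

    i = 0
    while i < len(source):
        c = source[i]
        mode = stack[-1] if stack else "code"

        if c == "\n" and stack:
            # Newline inside an interpolation: report it, then unwind all levels.
            flush("segment" if mode == "string" else "code")
            tokens.append(Token("error", "newline inside string interpolation"))
            while stack:
                closed = stack.pop()
                tokens.append(Token("exit_string" if closed == "string" else "exit_expr"))
            i += 1
        elif mode == "string":
            if source.startswith("{{", i) or source.startswith("}}", i):
                buf.append(c)           # doubled brace stands for one literal brace
                i += 2
            elif c == "{":
                flush("segment")
                tokens.append(Token("enter_expr"))
                stack.append("expr")
                i += 1
            elif c == '"':
                flush("segment")
                tokens.append(Token("exit_string"))
                stack.pop()
                i += 1
            else:
                buf.append(c)
                i += 1
        else:
            # Top-level code or an interpolated expression.
            if c == '"':
                flush("code")
                tokens.append(Token("enter_string"))
                stack.append("string")
                i += 1
            elif c == "}" and mode == "expr":
                flush("code")
                tokens.append(Token("exit_expr"))
                stack.pop()
                i += 1
            else:
                buf.append(c)
                i += 1

    mode = stack[-1] if stack else "code"
    flush("segment" if mode == "string" else "code")
    return tokens
```

Because a string opened inside an expression just pushes another entry onto the stack, nesting depth is limited only by the input, and the parser can rebuild the tree by matching each enter token with its exit token.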

The simple option for code generation is to translate interpolations into a tree of string + object expressions in the parser. However, the C# compiler uses DefaultInterpolatedStringHandler, which is designed for string interpolations and is presumably more efficient. It relies on generic methods though, which we can call, but supporting them would be quite a bit more work.
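The simple option could look roughly like the following Python sketch, which folds the parts of one interpolation into a left-nested tree of + nodes. The node classes and the `lower` and `render` helpers are hypothetical stand-ins for illustration, not the compiler's real parse tree types. The sketch also assumes the chain should start with a string so that every + is a string concatenation.

```python
import json
from dataclasses import dataclass
from functools import reduce


@dataclass
class Literal:
    text: str       # a string literal fragment


@dataclass
class Expr:
    source: str     # an interpolated expression, kept as source text here


@dataclass
class Add:
    left: object
    right: object


def lower(parts: list) -> object:
    """Fold string fragments and expressions into a left-nested tree of + nodes."""
    # Assumption: begin with a string operand so the result is always a string.
    if not parts or not isinstance(parts[0], Literal):
        parts = [Literal("")] + list(parts)
    return reduce(Add, parts)


def render(node: object) -> str:
    """Print the lowered tree as source-like text, for inspection only."""
    if isinstance(node, Literal):
        return json.dumps(node.text)
    if isinstance(node, Expr):
        return node.source
    return f"({render(node.left)} + {render(node.right)})"


tree = lower([Literal("name is "), Expr("name"), Literal(", age is "), Expr("age")])
print(render(tree))  # prints ((("name is " + name) + ", age is ") + age)
```

Each + here creates an intermediate string, which is the cost a handler-based approach would presumably avoid.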

degory added a commit that referenced this issue Feb 20, 2024
Enhancements:
- Add initial support for string interpolation (see #1071)

Bugs fixed:
- Can't use ref struct types (partial fix only #1077)
degory added a commit that referenced this issue Feb 21, 2024
Enhancements:
- Support string interpolation in literals delimited with double quotes (see #1071)

Notes:
- This changes how string literals are interpreted. Existing code with string literals that contain `{` and `}` will need to be updated to quote them. See linked issue for details.

Bumps #minor version
degory added a commit that referenced this issue Feb 27, 2024
Enhancements:
- Support alignment and format specifiers for interpolated expressions in string literals (#1071)
- Improved error recovery and reporting for incomplete string interpolations
@degory degory closed this as completed Mar 15, 2024