EEP 64: Triple-Quoted Strings #47
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,187 @@ | ||
Author: Kiko Fernandez-Reyes <kiko(at)erlang(dot)org> | ||
Status: Draft | ||
Type: Standards Track | ||
Created: 07-Jun-2023 | ||
Erlang-Version: OTP-27 | ||
Post-History: | ||
**** | ||
EEP XXX: Triple-quoted binary strings | ||
---- | ||
|
||
Abstract | ||
======== | ||
|
||
This EEP proposes syntax and runtime semantics for triple-quoted | ||
binary-strings. The main benefit is an easy way to write multi-line binary strings, | ||
similar to other languages, e.g., Elixir. | ||
|
||
Rationale | ||
========= | ||
|
||
One limitation of Erlang today (June 2023) is that there is no convenient way to write multi-line | ||
string binaries. The main reason to consider this EEP is | ||
[EEP-59][], where documentation attributes can | ||
benefit from multi-line string binaries: the documentation is compiled into the Docs chunk | ||
format (documentation in its binary form), which saves space and makes the | ||
documentation available to the shell. | ||
|
||
Triple-Quoted Binary-String Design Decisions | ||
--------------------------------------------- | ||
|
||
### Runtime semantics | ||
|
||
Triple-quoted strings should only produce binary-strings. This makes it easy to | ||
avoid typing issues with union types, and makes it clear that code using a triple-quoted string | ||
does not need functions with guards to test whether the produced result is a list or a | ||
binary (this can happen if Erlang adds string interpolation, | ||
[EEP-62][], | ||
[PR-7343][]). | ||
|
||
Triple-quoted binary-strings do not allow string interpolation and, if we were | ||
to accept it, only references to statically known values should be allowed, e.g., | ||
macros. This design decision forbids dynamic documentation that | ||
depends on code executed at runtime, e.g., doing an IO call to get a value not known | ||
statically and performing an action based on it (notice that this is allowed when | ||
using string interpolation). | ||
|
||
### Binary-String Design | ||
|
||
To make the usage of binary-strings similar to other languages, this EEP | ||
makes binary-string design decisions similar to those of the Elixir language. | ||
|
||
#### Binary-String Start and End | ||
|
||
Binary-strings must start and end with triple-quotes on their own lines: | ||
|
||
*Code Example* | ||
|
||
X = """ | ||
This is the beginning of the triple-quote text | ||
""" | ||
|
||
*Documentation Example* | ||
|
||
-doc """ | ||
Removes double-quotes (") from a given string | ||
""" | ||
remove_double_quotes(X) -> | ||
|
||
#### Binary-Strings Closing of Triple-Quotes | ||
|
||
The indentation of the closing triple-quotes determines how much leading whitespace is stripped from each line of the text: | ||
|
||
*1. Code Example* | ||
|
||
>> X = """ | ||
This text | ||
has no indentation | ||
""" | ||
|
||
Equivalent to: <<"This text\nhas no indentation\n">> | ||
|
||
*2. Code Example* | ||
|
||
>> Y = """ | ||
This text | ||
has indentation | ||
""" | ||
|
||
Equivalent to: <<" This text\n has indentation\n">> | ||
|
||
*3. Code Example* | ||
| |
||
>> Z = """ | ||
  This text has | ||
  two space indentation | ||
""" | ||
| |
||
Equivalent to: <<"  This text has\n  two space indentation\n">> | ||
|
||
*1. Documentation Example - No indentation* | ||
|
||
-doc """ | ||
Removes double-quotes (") from a given string | ||
""" | ||
remove_double_quotes(X) -> | ||
|
||
*2. Documentation Example - No indentation* | ||
|
||
-doc """ | ||
Removes double-quotes (") from a given string | ||
""" | ||
remove_double_quotes(X) -> | ||
|
||
*3. Documentation Example - Indentation* | ||
|
||
-doc """ | ||
Removes double-quotes (") from a given string | ||
""" | ||
remove_double_quotes(X) -> | ||
|
||
#### Binary-Strings Errors | ||
Perhaps worth mentioning it also errors when there is a
Since we are still compatible with existing code, and one can write strings using single quotes, we may have to allow code that for some reason is
Perhaps it is worth pushing a deprecation warning to Erlang/OTP 26.x that warns on triple quotes? |
||
|
||
An error should be raised when a line of text starts to the left of the closing triple-quotes. | ||
Raising an error here gives code written with triple-quotes a uniform style: | ||
|
||
*Code Example* | ||
|
||
>> Err = """ | ||
This shall be an error | ||
""" | ||
|
||
error | ||
|
||
*Documentation Example* | ||
|
||
-doc """ | ||
Removes double-quotes (") from a given string | ||
""" | ||
remove_double_quotes(X) -> | ||
|
||
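As an illustration of the two rules above (the closing triple-quotes set the indentation to strip, and a text line that starts to the left of them is an error), here is a minimal Erlang sketch. It is not the proposed implementation. The module `tq_indent_sketch`, the function `strip/2` and the `outdented_line` error term are hypothetical names. The sketch assumes the text lines are already split, that indentation consists of spaces only, and that blank lines get no special treatment.

```erlang
-module(tq_indent_sketch).
-export([strip/2]).

%% strip(Lines, CloseIndent) -> {ok, binary()} | {error, {outdented_line, N}}
%%
%% Lines:       the text lines between the opening and closing
%%              triple-quotes, as binaries without line terminators.
%% CloseIndent: number of spaces in front of the closing triple-quotes.
%%
%% Hypothetical helper, used only to illustrate the rule in this EEP.
strip(Lines, CloseIndent)
  when is_list(Lines), is_integer(CloseIndent), CloseIndent >= 0 ->
    strip(Lines, CloseIndent, 1, []).

strip([], _CloseIndent, _N, Acc) ->
    {ok, iolist_to_binary(lists:reverse(Acc))};
strip([Line | Rest], CloseIndent, N, Acc) ->
    case Line of
        <<Prefix:CloseIndent/binary, Text/binary>> ->
            case only_spaces(Prefix) of
                true ->
                    %% Drop the shared indentation, keep the rest of the
                    %% line and terminate it with a newline.
                    strip(Rest, CloseIndent, N + 1, [[Text, $\n] | Acc]);
                false ->
                    %% Text starts to the left of the closing quotes.
                    {error, {outdented_line, N}}
            end;
        _ ->
            %% Line is shorter than the required indentation.
            {error, {outdented_line, N}}
    end.

%% True when the binary consists of space characters only.
only_spaces(Bin) ->
    Bin =:= binary:copy(<<" ">>, byte_size(Bin)).
```

With this sketch, `tq_indent_sketch:strip([<<"    This text has">>, <<"    two space indentation">>], 2)` returns `{ok, <<"  This text has\n  two space indentation\n">>}`, and `tq_indent_sketch:strip([<<"This shall be an error">>], 2)` returns `{error, {outdented_line, 1}}`.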
#### Escaping Newlines in Triple-Quotes | ||
|
||
A newline can be escaped by ending a line with `\`, which joins that line with the next one. | ||
|
||
To write `\` inside triple quotes, one needs to use `\\`. [EEP-59][] should | ||
ignore the meaning of `\` if inside a code block. | ||
|
||
*Code Example* | ||
|
||
>> X = """ | ||
This text \ | ||
has no indentation | ||
""" | ||
|
||
Equivalent to: <<"This text has no indentation\n">> | ||
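The escaping rule above can be sketched in Erlang as follows. This is only a hedged illustration, not the scanner change itself. The module `tq_escape_sketch` and the function `join/1` are hypothetical names. The sketch assumes indentation has already been stripped, and that any `\` other than `\\` or a trailing `\` is kept verbatim, which the EEP does not specify.

```erlang
-module(tq_escape_sketch).
-export([join/1]).

%% join(Lines) -> binary()
%%
%% Lines: the text lines of a triple-quoted string, as binaries without
%%        line terminators. Hypothetical helper for illustration only.
join(Lines) when is_list(Lines) ->
    iolist_to_binary([unescape(Line, []) || Line <- Lines]).

%% `\\` produces a single `\`.
unescape(<<$\\, $\\, Rest/binary>>, Acc) ->
    unescape(Rest, [$\\ | Acc]);
%% A single trailing `\` escapes the newline: emit no newline.
unescape(<<$\\>>, Acc) ->
    lists:reverse(Acc);
%% Any other byte is copied as is (assumption, see above).
unescape(<<C, Rest/binary>>, Acc) ->
    unescape(Rest, [C | Acc]);
%% End of an ordinary line: terminate it with a newline.
unescape(<<>>, Acc) ->
    lists:reverse([$\n | Acc]).
```

Calling `tq_escape_sketch:join([<<"This text \\">>, <<"has no indentation">>])` returns `<<"This text has no indentation\n">>`, matching the example above.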
Makes sense the possibility to comment after
I think that's ambiguous because
Oh, this is true and makes things tricky.
Oh, using
I think this is easy, I've implemented this now based on your idea of the
I don't know if this is a good idea, but it's possible.
My concern is that the last example is still ambiguous with escaping the
I do not understand why you want a comment in a multi line string. We want it for Markdown documentation, and Markdown uses trailing So I do not know if we want newline escape. Or any escape codes at all, for that matter, since Markdown uses
Comment inside the string makes the syntax too complex IMO. It's already possible to put two strings side-by-side to concatenate them, e.g.
"foo " % this comment is ignored
"bar" = """
foo \
"""
%% this comment would be ignored too
"""
bar\
""".
Sorry @RaimoNiskanen, I commented without thinking in Markdown. |
||
|
||
*Documentation Example* | ||
|
||
-doc """ | ||
Removes double-quotes (") \ | ||
from a given string | ||
""" | ||
remove_double_quotes(X) -> | ||
|
||
[EEP-59]: https://www.erlang.org/eeps/eep-0059 | ||
"EEP 59: Module attributes for documentation" | ||
|
||
[EEP-62]: https://www.erlang.org/eeps/eep-0062 | ||
"String Interpolation Syntax" | ||
|
||
[PR-7343]: https://github.com/erlang/otp/pull/7343 | ||
"Feature: String Interpolation" | ||
|
||
Copyright | ||
========= | ||
|
||
|
||
This document is placed in the public domain or under the CC0-1.0-Universal | ||
license, whichever is more permissive. | ||
|
||
[EmacsVar]: <> "Local Variables:" | ||
[EmacsVar]: <> "mode: indented-text" | ||
[EmacsVar]: <> "indent-tabs-mode: nil" | ||
[EmacsVar]: <> "sentence-end-double-space: t" | ||
[EmacsVar]: <> "fill-column: 70" | ||
[EmacsVar]: <> "coding: utf-8" | ||
[EmacsVar]: <> "End:" | ||
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: " |
I worry this may be potentially confusing. Why is `"foo"` a list but `"""foo"""` a binary? What if I want to write a long text but as a charlist instead? Here is a potential example. So, even though I would prefer binaries, I believe a more consistent option is to return charlists.
Then, for binaries, there are a few options:
- Write `<<"""...""">>`
- Build on EEP 63 to introduce `u"""..."""`
However, for documentation in particular, the compiler can convert `-doc """...""".` into binaries, so the distinction may not be terribly important.
Completely agreeing with José here, I think triple-quoted strings should produce strings in Erlang unless enclosed in a binary context; this would be more consistent with the existing Erlang syntax.
I agree with José as well, I think it is confusing to use `""""` as a binary. Another option for unicode binaries could be to use backticks:
Not the most convenient key on some keyboards, but Javascript seems to get away with using it so maybe we can as well.
I like using backticks to mean binary. That nicely solves how to write binary literal strings on one line as well. I think we have a winner!
Backticks is interesting because it also exists in Prolog. Erlang's syntax comes from Prolog as we all know (which also has single quote for atoms). In SWI Prolog it seems to be configurable how double-quoted and back-quoted strings work: https://www.swi-prolog.org/pldoc/man?section=string and in GNU Prolog similarly.
But, for simple binary strings, isn't a prefix like `b"binary string"` simpler and more aligned with Python, C, Rust, etc. for different kinds of strings?
I think `b""` is enough for UTF-8 encoded binaries as long as the source code is UTF-8 encoded, which it normally is. (If we then add triple quotes for charlist strings, then `b"""..."""` seems the natural choice for the binary version.)
(Btw C11 uses `u""` for UTF-16 strings, `U""` for UTF-32 strings and `u8""` for UTF-8, so `u""` for UTF-8 would be confusing in this context.)
Yes, but a regexp string is a different animal where I guess you do not want any escape sequences, and therefore cannot use interpolation. Possibly chosen with a prefix, e.g. `r"foo\(\.\)+"`.
I like that Erlang becomes more Perl-like. :-)
This will be also problematic with heredocs and fenced code blocks:
(I'm pasting an image because I don't know how to write it in Markdown so it renders correctly. :))
I'm personally a fan of code blocks done with 4 spaces indent but fenced blocks are sometimes useful too.
@wojtekmach Markdown allows any number of backticks, so if you need 3 backticks, just write it within 4 backticks. Perhaps Erlang could allow that trick too:
(Note that the above example was enclosed in 5 backticks.)
I like the backtick idea, but I'll prefer it to be used like in Javascript, where it permits to do interpolation, for example:
or
Or (why not) single backtick as an alternative to triple quotes using the same idea, the last backtick defines the indentation
BTW, triple quotes give more control of the encoding (I don't see any problem writing the below)