One multiline string to rule them all. #225

Closed
wants to merge 1 commit into from

Conversation

BurntSushi
Member

Adds multiline strings to TOML with a way to escape whitespace for
writing long single line strings.

I like this approach because it combines multiline strings and long single line strings into one form. All of these ideas were mentioned in one form or another in #92, #163 and #164.

I chose """ because that seems to be what people converged on. The obvious downside is that """ cannot be typed into a multiline string. This isn't a very high price to pay, but if we didn't want to pay it, then I'd suggest Lua's multiline string syntax:

-- Regular multiline string.
s = [[hello world]]

-- Want to write `[[something]]`?
s = [=[ hello [[something]] world ]=]
s = [=====[ this works too ]=====]

Basically, the quoting mechanism builds in the ability to change the delimiters.
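
As a rough illustration of how that delimiter matching could work in a parser, here is a minimal Python sketch. The function name `read_long_string` is hypothetical and not part of any TOML or Lua library; it only shows that the number of `=` signs in the opener decides which closer ends the string. It deliberately ignores other details of real Lua (for example, Lua skips a newline that immediately follows the opening bracket).

```python
import re
from typing import Tuple

# Opening long bracket: '[', zero or more '=', then '['.
_OPENER = re.compile(r"\[(=*)\[")


def read_long_string(src: str, pos: int = 0) -> Tuple[str, int]:
    """Read a Lua-style long string that starts at src[pos].

    Returns the string contents and the index just past the closing
    delimiter. Hypothetical helper, for illustration only.
    """
    m = _OPENER.match(src, pos)
    if m is None:
        raise ValueError("expected an opening long bracket such as [[ or [=[")
    level = m.group(1)             # the run of '=' signs selects the delimiter
    closer = "]" + level + "]"     # only a closer of the same level ends it
    end = src.find(closer, m.end())
    if end == -1:
        raise ValueError("unterminated long string, expected " + closer)
    return src[m.end():end], end + len(closer)


if __name__ == "__main__":
    text, _ = read_long_string("[=[ hello [[something]] world ]=]")
    print(repr(text))
```

Because `]]` is not the level-1 closer `]=]`, the inner `[[something]]` is kept as ordinary content.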

@amattn

amattn commented Jun 26, 2014

Using """ means that you can no longer put Python code in there. But """ is so much cleaner than Lua's method.

Can we do some hybrid like """, """", """""... up to some max number of "'s?

That would allow flexibility but maintain readability/familiarity.

@BurntSushi
Member Author

I love Lua's method. It's elegant, simple and robust. If we're going to go with smart delimiters, then I would really love to just use Lua's approach.

If you had to write Python, you could always use ''' instead, since Python supports both.

With "{n}.*"{n}, how would you write a string that began and ended with quotes?

@pygy
Contributor

pygy commented Jun 26, 2014

This proposition does not cover #163, which was about splitting long strings without line breaks into smaller ones to be concatenated.

@BurntSushi
Member Author

@pygy Yes it does. See the second example.

@parkr

parkr commented Jun 26, 2014

I feel really weird about multiline strings. It just doesn't sit right with my idea of what TOML ought to be. Why do we require >75 characters in a value? Is TOML meant to hold content? I'd like to hear more about the motivations for this, and get @mojombo's take on this proposal. Overall, I'm 👎 on multiline strings, because I don't think TOML is for content. 😕

@BurntSushi
Member Author

I wouldn't be heartbroken if this didn't make it in, but I understood it to be a popularly requested feature.

@pygy
Contributor

pygy commented Jun 26, 2014

You mean this one?

s = [=[ hello [[something]] world ]=]

That's not the use case I have in mind. I mean the ability to split a long string without line ends like

s = "qwertyuiopoiuytewqwertyuiopoiuytewqwertyuioiuytrertyuioiyrewertyuiweruoureruiodfjkjhfdfghjkkhfdfhjkkhfdfgjkgfdfghjkkjhgfsyrertyuihffghjkkhdfgh"

into

s = "qwertyuiopoiuytewqwertyuiopoiuytewqwertyuioiuytrertyuioiyr" .. -- or `+`, or even 
    "ewertyuiweruoureruiodfjkjhfdfghjkkhfdfhjkkhfdfgjkgfdfghjkk" .. -- nothing at all like C
    "jhgfsyrertyuihffghjkkhdfgh"

Lua long strings preserve the line ends as is.

@BurntSushi
Member Author

@pygy

 +For writing long strings without introducing extraneous whitespace, use `\\n`.
 +All whitespace following `\\n` up to the first non-whitespace or `\n` character 
 +is removed:
 +
 +```toml
 +# The following strings are byte-for-byte equivalent:
 +key1 = "The quick brown fox jumps over the lazy dog."
 +key2 = """
 +The quick brown \
 +  fox jumps over \
 +    the lazy dog."""
 +```

@pygy
Contributor

pygy commented Jun 26, 2014

Sorry, it's time for me to go to sleep

I hadn't seen the diff, and thought that you were suggesting Lua-style strings (which I like, personally), rather than the current proposal.

@BurntSushi
Member Author

@pygy I like Lua-style strings too. I used """ because I perceived that to be the popular choice in looking over past issues. I'm trying to work with a consensus over here. :-)

@pygy
Contributor

pygy commented Jun 26, 2014

I agree with you about the consensus. Lua is still a bit obscure, and using its syntax would likely cause a riot :-)

@jprichardson

I like """ or '''

But why do you have to include \\n?

i.e. why can't you do this:

key2 = """
The quick brown 
  fox jumps over
    the lazy dog."""

?

@redhotvengeance
Contributor

@jprichardson I believe the intention is to have the ability to both preserve and eliminate whitespace in multiline strings. Sans \\n the whitespace is preserved, and with \\n the whitespace is trimmed.

@BurntSushi
Member Author

@jprichardson The string you gave:

key2 = """
The quick brown 
  fox jumps over
    the lazy dog."""

Would be valid and not equivalent to:

key2 = """
The quick brown \
  fox jumps over \
    the lazy dog."""

The latter string is byte-for-byte equivalent to "The quick brown fox jumps over the lazy dog.". The first string has line breaks in it.

The idea is that \ before a \n escapes the new line and trims all following whitespace up to the next \n or non-whitespace character.
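
To make the rule concrete, here is a minimal Python sketch of how a parser might post-process the text between the `"""` delimiters. The name `fold_multiline` is hypothetical. Two assumptions are baked in: a newline directly after the opening `"""` is dropped (the byte-for-byte equivalence claimed above seems to require this), and "whitespace" after the escaped newline means spaces and tabs only. A full parser would also have to recognise an escaped backslash (`\\`) at the end of a line, which this sketch does not.

```python
import re

# A backslash directly before a newline, followed by any run of spaces
# or tabs. Trimming stops at the next newline or non-whitespace character.
_CONTINUATION = re.compile(r"\\\n[ \t]*")


def fold_multiline(body: str) -> str:
    """Apply the backslash-newline rule to the raw body of a \"\"\" string.

    Hypothetical helper, for illustration only.
    """
    # Assumption: a newline right after the opening delimiter is dropped.
    if body.startswith("\n"):
        body = body[1:]
    # Remove each escaped newline together with the indentation after it.
    return _CONTINUATION.sub("", body)


if __name__ == "__main__":
    escaped = "\nThe quick brown \\\n  fox jumps over \\\n    the lazy dog."
    plain = "\nThe quick brown \n  fox jumps over\n    the lazy dog."
    print(repr(fold_multiline(escaped)))
    print(repr(fold_multiline(plain)))
```

With the backslashes, the three lines collapse into the single-line sentence; without them, the line breaks and indentation are left exactly as written.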

@ChristianSi
Contributor

Python allows both """...""" strings (which can contain ''') and '''...''' strings (which can contain """). This might be a good solution for the "what about """ in long strings?" issue.

Python also allows backslash-escaping triple quotes in triple-quoted strings to handle the unlikely case that both ''' and """ occur in the same string.

@BurntSushi
Member Author

@ChristianSi I think we can just fall back on escaping quotes. Regular TOML strings already support it.

@ChristianSi
Contributor

@BurntSushi That's true, I agree.

@BurntSushi
Member Author

@mojombo What do you think?

@johanfange

I propose using ''' instead, and as a raw string. No escaping, no \n, no C-style \-continues-the-line. Plain and simple.

Furthermore, I would support trimming the first and last newline in a multi-line ''' string. These would then be equivalent:

somevar = '''
line1
line2
'''

somevar2 = '''line1
line2
'''

somevar3 = '''   
line1
line2'''

somevar4 = '''line1
line2'''

somevar5 = "line1\nline2"

@BurntSushi
Member Author

@johanfange No. We need two versions. One of them is a regular string, """ and one is raw '''. Regular strings support escape sequences (\t, \n, \uXXXX, \UXXXXXXXX) and the ability to write long strings without line breaks. These things are orthogonal to raw strings.

I'd rather not trim the last new line. Why do you want to? The common case is to end your string with a new line, which looks weird and counter-intuitive if we trim the last one.
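
To illustrate why the two forms are orthogonal, here is a minimal Python sketch contrasting them. The names `decode_basic` and `decode_raw` are hypothetical. The escape table covers only the sequences mentioned in this thread (`\t`, `\n`, `\uXXXX`, `\UXXXXXXXX`) plus `\"` and `\\`; the full set of escapes is an assumption left out here.

```python
import re

# Single-character escapes handled by this sketch (not a complete list).
_SIMPLE = {"t": "\t", "n": "\n", '"': '"', "\\": "\\"}

# A backslash followed by a 4- or 8-digit hex escape, or by any one character.
_ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|.)", re.DOTALL)


def decode_basic(body: str) -> str:
    """Interpret escape sequences, as a regular (\"\"\") string would."""
    def replace(match):
        esc = match.group(1)
        if len(esc) > 1:                     # \uXXXX or \UXXXXXXXX
            return chr(int(esc[1:], 16))
        if esc in _SIMPLE:
            return _SIMPLE[esc]
        raise ValueError("unsupported escape: \\" + esc)
    return _ESCAPE.sub(replace, body)


def decode_raw(body: str) -> str:
    """A raw (''') string keeps every character exactly as written."""
    return body


if __name__ == "__main__":
    source = r"col1\tcol2\u00e9"
    print(repr(decode_basic(source)))
    print(repr(decode_raw(source)))
```

The same source text yields a tab and an accented character in the regular form, and literal backslashes in the raw form.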

@mojombo
Member

mojombo commented Jun 27, 2014

I think we can just fall back on escaping quotes. Regular TOML strings already support it.

Agreed.

@johanfange

I'd rather not trim the last new line. Why do you want to? The common case is to end your string with a new line, which looks weird and counter-intuitive if we trim the last one.

Having read through your arguments in the old threads on this I now agree that last line trimming isn't desirable.

However, I'm unconvinced of the need for multi-line strings with escaping. What is the suggested use-case for this? It's not very pretty and doesn't seem terribly useful at first glance.

"""
The quick brown \
  fox jumps over \
    \t\t\nthe lazy dog.
"""

@BurntSushi
Member Author

What is the suggested use-case for this?

Writing long strings without line breaks (and obviously, without your editor auto-wrapping the line ewyuck).

@johanfange

@BurntSushi Okay, just seemed a bit overkill to me, especially considering parsers have to support it too. I think that kind of thing would be more appropriate at the application-level, e.g. by writing in Markdown, HTML or something.

Then again, I usually don't write long one-line paragraphs in my config files, so who am I to judge.

@BurntSushi
Member Author

Closed in favor of #228.
