A reserved syntax for user extension in strings #445

Closed
nugend opened this issue Jan 30, 2017 · 20 comments

Comments

@nugend

nugend commented Jan 30, 2017

Something I have been doing with TOML is making a small extension to the syntax that lets users make substitutions inside strings. This is something like templating, but simpler for implementers and users and explicitly less powerful.

For example:

title = "TOML Example"

[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00 # First class dates

[database]
server = "192.168.1.1"
ports = [ 8001, 8001, 8002 ]
connection_max = 5000
enabled = true
note = "$(ref title)"

In this case, the ref "function" is evaluated by the code loading the TOML file to be a reference to the title top level element and a substitution of the $(...) is made.
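As a rough illustration of the substitution described above, here is a minimal Python sketch of a loader-side pass that expands $(ref ...) after parsing. The dict stands in for what a TOML parser would return, and the names REF_PATTERN, lookup, and expand_refs, as well as the dotted-key form, are assumptions made for this example rather than anything from the proposal.

```python
import re

# Hypothetical pattern for the "$(ref some.key)" form; dotted keys are an assumption.
REF_PATTERN = re.compile(r"\$\(ref\s+([A-Za-z0-9_.-]+)\)")


def lookup(root, dotted_key):
    """Follow a dotted key such as "database.server" from the document root."""
    node = root
    for part in dotted_key.split("."):
        node = node[part]
    return node


def expand_refs(node, root):
    """Return a copy of node with every $(ref key) replaced by the referenced value."""
    if isinstance(node, dict):
        return {k: expand_refs(v, root) for k, v in node.items()}
    if isinstance(node, list):
        return [expand_refs(v, root) for v in node]
    if isinstance(node, str):
        return REF_PATTERN.sub(lambda m: str(lookup(root, m.group(1))), node)
    return node  # numbers, booleans, and dates pass through unchanged


# Stand-in for the parsed form of the example document.
config = {
    "title": "TOML Example",
    "database": {"server": "192.168.1.1", "note": "$(ref title)"},
}
expanded = expand_refs(config, config)
print(expanded["database"]["note"])
```

The pass runs entirely after parsing, so the TOML parser itself only ever sees an ordinary string.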

Other examples could be env, to substitute an environment variable from the execution context, or math, for basic mathematical functions. The point is not the functions themselves, but that users would have free rein to make extensions of their own.

Please note, I am not suggesting that the TOML spec formally say that anything should happen when the $(...) is found inside a string, simply that this is a reserved character sequence for user extension and will not be taken by another behavior or functionality at a later point in time. Further, if a user comes up with a syntax extension that involves additional parentheses, it's their own issue to deal with escaping.

From my understanding of the existing TOML spec, this shouldn't cause an issue and I do not believe that anything in the spec would indicate that something is planned. If this would cause a conflict with something, I don't think the specific characters are an issue and I'm sure that something else could reasonably be used instead.

If this is rejected, I understand, and plan to continue using this "extension" in my own code.

@StefanKarpinski

Julia uses prefixed strings to indicate special meaning, e.g. ip"1.2.3.4" (prefixed strings can also optionally have trailing modifiers, as in r"^[a-z]+$"ism). This syntax could be used in TOML to indicate the type of a string to implementations that are interested in supporting types, while implementations that are not could simply ignore that additional information and parse the string as a string. However, I'm not sure it's worth complicating the standard with such a feature.

@nugend
Author

nugend commented Mar 3, 2017

Based on my reading of the spec, I believe that prefixed strings would mean that current TOML parsers would be broadly incompatible (since the parsers would not recognize the leading character).

I feel that the advantage of my proposed change is that no TOML parsers would need to change since the $(...) syntax is handled wholly by the library or application specifying such an extension. A TOML parser could add support for handling strings containing $(...) by adding a callback, of course.
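One way the callback idea could look, sketched in Python under stated assumptions: the application registers handlers by name and a small expander dispatches each $(name args) occurrence to them. make_expander, CALL_PATTERN, the handler names, and the DEMO_USER variable are all hypothetical and exist only for this example.

```python
import os
import re

# Matches "$(name argument text)"; nested parentheses are deliberately not supported.
CALL_PATTERN = re.compile(r"\$\((\w+)\s*([^()]*)\)")


def make_expander(handlers):
    """Build a function that expands $(name args) using the given handler callbacks."""

    def expand(text):
        def replace(match):
            name, arg = match.group(1), match.group(2).strip()
            handler = handlers.get(name)
            if handler is None:
                return match.group(0)  # unknown extension: leave the text untouched
            return str(handler(arg))

        return CALL_PATTERN.sub(replace, text)

    return expand


values = {"title": "TOML Example"}
expand = make_expander({
    "ref": lambda key: values[key],
    "env": lambda name: os.environ.get(name, ""),
})

os.environ["DEMO_USER"] = "tom"  # set only so the example is reproducible
print(expand("$(ref title) run by $(env DEMO_USER)"))
print(expand("$(math 1 + 1)"))  # no "math" handler registered, so it is left as-is
```

Leaving unknown names untouched is a design choice for the sketch; an application could just as reasonably raise an error.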

@a-teammate

As this is alpha, avoiding breakage of existing TOML parsers is currently not a goal of the spec :)

The ${ } syntax would require that you escape those characters if you do not want to use refs. And ${..} is very commonly used for string refs; see e.g. JavaScript template literals and many others.
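The escaping cost mentioned here can be seen in Python's standard string.Template class, which uses the same ${...} form. This is only an analogy to show the trade-off, not a proposal for how TOML would behave; the values dict is made up for the example.

```python
from string import Template

values = {"title": "TOML Example"}

# "${title}" is a placeholder; "$$" is the escape for a literal dollar sign.
note = Template("${title} costs $$5").substitute(values)
print(note)

# safe_substitute leaves unknown placeholders in place instead of raising KeyError.
partial = Template("${title} by ${author}").safe_substitute(values)
print(partial)
```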

@a-teammate a-teammate mentioned this issue Oct 14, 2017
@pradyunsg
Member

I don't like this idea.

You can already use strings to represent such values and parse/process them at your application level. You are going to have to do the exact same thing anyway if you use the proposed syntax extension.

Sure, it alleviates some complexity at the application level if you can have some extra information in the language, but I feel that this is too niche to be a part of TOML.

@nugend
Author

nugend commented Nov 17, 2017

I think you are misunderstanding the proposal: it does not say that the TOML parser will do something when that string is encountered; it is a guarantee that those tokens will not be used in the future for some other purpose.

Just that the user is guaranteed to not have to worry about forwards compatibility.

@pradyunsg
Member

> I think you are misunderstanding the proposal

This is just a case of the consumer handling certain strings (here strings starting with $(...)) specially and performing extra actions (like substitution/evaluation as specified in the OP). Am I understanding that correctly?

(by consumer I mean an end-user tool like cargo which uses the data in the config file.)

If so, allow me to try again, to say why I don't like this:

  1. It can be done today, on top of the current spec.
  2. TOML shouldn't care what the contents of a string literal are -- as long as there are no control characters and the escapes are correct, etc. No special casing or "blessing" of some format within a string.

> Just that the user is guaranteed to not have to worry about forwards compatibility.

As I say above, TOML should never put any restriction on or "bless" character sequences within a string.

No, your language extension would not break because TOML stopped allowing $ in strings or something. That is basically guaranteed.

@ChristianSi
Contributor

I agree with @pradyunsg. This request is pointless because it's utterly unnecessary.

TOML does not care what's inside a string. Not now, nor ever in the future.

@nugend
Author

nugend commented Nov 20, 2017

> TOML does not care what's inside a string. Not now, nor ever in the future.

I do not see that explicitly stated in the spec, so it's not in the spec.

@pradyunsg
Member

To draw a parallel, JSON's spec doesn't say that JSON doesn't care about the contents of a string. We all know it doesn't. It's a strong guarantee even though not explicitly stated.

With that, I'll recede to merely spectating on this issue.

@nugend
Author

nugend commented Nov 21, 2017

JSON is a data serialization format. TOML is a language. There's a different standard of expectation in terms of reservation of behavior.

Furthermore, TOML already supports a visual alignment control character. That wasn't in TOML until version 0.3.

@TheElectronWill
Contributor

What is the point of saying "hey parsers, you can do something special with this character in strings" while it's already possible? It seems that it would complicate the spec for nothing.
Either you define precise rules about the substitution, or you let the parsers decide.

@pradyunsg
Member

pradyunsg commented Nov 21, 2017 via email

@nugend
Author

nugend commented Nov 21, 2017

Look, whatever. If TOML never gets a new string feature, then my argument is moot. But if a change to how strings are interpreted ever occurs, I would urge the maintainers to consider adding a user-reserved syntax for extension.

@mojombo
Member

mojombo commented Nov 23, 2017

Thanks for your thoughts on this, @nugend. You're right that I can't say for sure that no feature will ever be added to strings that would affect your proposed extension syntax, but I can realistically say that the probability of it happening is extremely close to zero. If you're not doing anything crazy with newlines or backslashes or similar characters, a string is a string, and always should be. It would be quite non-obvious to make a change to strings that would break how "$(ref title)" is parsed. As such, I don't think the spec needs to be complicated by the matter, and I hope this puts your mind at ease, as much as a non-spec-changing comment can.

@mojombo mojombo closed this as completed Nov 23, 2017
@nugend
Author

nugend commented Nov 24, 2017

That's fine. As I said in the original comment, I understand it may be rejected. If you do see reason to add some change to strings which would break this or any other likely substitution sequences though (I really have no idea what someone else might be using for a similar purpose, of course), please reconsider.

I mean, even if I get away with it, someone else might get bitten badly.

@PetrPy

PetrPy commented May 10, 2019

If TOML is to be a preferred config language, some kind of key value reuse/expression evaluation/string interpolation is necessary. Nobody wants to write the same file paths all over again. At least the following should be possible:

root_dir = "/root/foo"
output_dir = {root_dir} + "/output"
input_dir = {root_dir} + "/input"

port = 80
other_port = {port}

As the need is obvious, not implementing any interpolation will lead to many different application specific formats, rendering TOML less readable.

I know that implementing such a thing can make TOML much more complicated, however, I believe that a reasonable minimum can be found. For example:

  1. Only values of the same data type can be combined, such as key1 = "abc" + {key_for_a_string} or key2 = {key_for_an_integer} + 1.

  2. Only basic operators are allowed, such as + for string concatenation; +, -, *, // for integers; +, -, *, / for floats; and "not" for booleans.

Even implementing string concatenation only would cover 90% of cases, i.e., defining directories and subdirectories.
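For comparison, the directory case can already be approximated at the application level with valid TOML by keeping the placeholder inside an ordinary string. The Python sketch below assumes the file has been parsed into a dict; the {key} convention and the interpolate helper are hypothetical, not part of TOML or of any TOML library.

```python
# Stand-in for parsed TOML in which reuse is written inside ordinary strings.
raw = {
    "root_dir": "/root/foo",
    "output_dir": "{root_dir}/output",
    "input_dir": "{root_dir}/input",
    "port": 80,
}


def interpolate(table):
    """Expand {key} placeholders in string values using the table's own top-level keys."""
    result = {}
    for key, value in table.items():
        if isinstance(value, str):
            result[key] = value.format_map(table)
        else:
            result[key] = value  # non-string values are copied as-is
    return result


config = interpolate(raw)
print(config["output_dir"])
```

This is a single pass, so a placeholder that points at another interpolated value is not expanded further, and literal braces in a string would need to be doubled.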

@eksortso
Contributor

The need may be obvious (to me, maybe), but it's not yet minimal, and so it doesn't belong in TOML v1.0.0, which is what the community here is focused on recently. Even a basic "put this string in that string" feature adds a layer of complexity to how strings are handled, because we'd have to decide the order that keys are evaluated, how escape sequences work after interpolation, proper ways to handle circular references, etc.

I'm looking forward to the post-v1.0 era, when we can take up new features like this with an eye on how these sorts of things are handled in the wild, which will make newer proposals much more obvious and minimal.

But let's get ourselves to v1.0 first before we address this subject again.
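To make the evaluation-order and circular-reference concerns above concrete, here is a small Python sketch of one possible policy: resolve placeholders on demand, depth first, and fail when a key is reached again while it is still being resolved. The {name} syntax, the resolve function, and the CircularReference exception are assumptions for illustration only, and the sketch does not address escape sequences at all.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


class CircularReference(Exception):
    """Raised when a key refers back to itself, directly or indirectly."""


def resolve(table, key, active=()):
    """Return table[key] with {name} placeholders expanded on demand, depth first."""
    if key in active:
        raise CircularReference(" -> ".join(active + (key,)))
    value = table[key]
    if not isinstance(value, str):
        return value
    return PLACEHOLDER.sub(
        lambda m: str(resolve(table, m.group(1), active + (key,))), value
    )


paths = {"root": "/root/foo", "out": "{root}/output", "logs": "{out}/logs"}
print(resolve(paths, "logs"))

loop = {"a": "{b}", "b": "{a}"}
try:
    resolve(loop, "a")
except CircularReference as err:
    print("circular reference:", err)
```

Resolving on demand makes the order in which keys appear in the file irrelevant, which is only one of several defensible answers to the ordering question.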

@PetrPy

PetrPy commented May 10, 2019

Fair enough, I'll wait for 2.0 and stick with the good old ini files for now. Thanks for the discussion!

@ChristianSi
Contributor

@PetrPy Why ini files? If you use string interpolation in ini files, you'll have to roll your own solution. Surely you can do the same in TOML as well, and still get all the other advantages that TOML has to offer, such as a more flexible data model, data types, and a precise specification.

@PetrPy

PetrPy commented May 13, 2019

A pragmatic choice. ConfigParser, which parses INI files with interpolation, is part of the Python standard library, which makes it a semi-standard.
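For readers unfamiliar with it, the ConfigParser behaviour referred to here looks roughly like this. The section and key names are made up for the example, reusing the directory case from earlier in the thread.

```python
import configparser

INI_TEXT = """
[paths]
root_dir = /root/foo
output_dir = %(root_dir)s/output
input_dir = %(root_dir)s/input
"""

# BasicInterpolation is ConfigParser's default; it expands %(name)s within a section.
parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

print(parser["paths"]["output_dir"])
print(parser.get("paths", "output_dir", raw=True))  # raw=True skips interpolation
```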


9 participants