-
Notifications
You must be signed in to change notification settings - Fork 856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify comment preservation behavior in the TOML spec #836
Comments
I don't disagree that round-tripping comments can be very useful, but I don't really see what the advantage would be to add this to the specification? It's not part of the Go specification either for example: it's just an implementation detail of the Go toolchain. And for a lot of use case preserving comments isn't important at all. |
I don't think this needs to be added to the specification. FWIW, there is at least https://pypi.org/project/tomlkit/ already, that claims to be a style preserving TOML encoder/decoder (I've never actually used it, to know how good a job it does) -- basically, this is achievable without anything explicit about this in the specification, and as @arp242 has noted already, there's lots of use cases where preserving comments and round-tripping behaviours are not required. If there's a bunch of such parser libraries and their authors reckon that it would be useful to have a common specification for how style preservation should work with TOML, we can look into this then. Otherwise, we're just specifying things in thin air with no binding effects whatsoever and I'd prefer to defer this till then. |
Thanks for the feedback. I do agree that adding things to the spec out of thin air doesn't necessarily make sense, but there is also a bit of a chicken and egg problem - if the spec never says anything about it then implementers likely won't think about it either until someone asks for it later. I understand the reasons given against adding it to the spec, but also, even if just to document it, here are some arguments on the other side of it that might help choose a direction for this in the future:
Maybe the spec doesn't need something detailed and elaborate, but rather just to mention that implementations that choose to have such behavior can do so and some basic guidance on how that would be done. Another idea is that perhaps a reference implementation would help? I haven't looked at the C TOML implementations, but maybe something written in C and thus usable from many other languages would help work out the behavior in more detail. |
Closing this since I don't think the specification needs to specify "you can decide whether to preserve comments" or describe the implementation concerns of how parsers can represent the style-preserving structure. |
Wish I saw this 2 years ago (when this issue was created), as the lack of comment preservation makes comments in certain files completely useless depending entirely upon whether the application serializes its suggestions back to disk...which many do. Despite TOML thinking that all config files should support comments and is intended to be for humans, TOML has unfortunately become another JSON and is intended to be for just apps. Oh wells... |
You have this backwards. Comment preservation is only relevant in TOML serialisation contexts, which are by definition performed by an application. The language has tens, possibly hundreds, of thousands of users at this point, very few of which need this. If you have a decent point to make, then by all means make it, but being salty just for the sake of it isn't going to advance the language in any direction you want. |
I wonder if there is a way to gain more objective data on this. I run into this issue (config files that need to be equally editable by humans and code) very frequently, so I would be surprised if this is really that rare. But I don't have any actual data other than my own experience and intuition. (And 9 months later my position is still the same - I believe this is actually really important and I suspect more common than is obvious.) If I do think of a way to get more concrete data on this I will certainly post it here. |
Personally, I'm suspicious of applications that re-write their configuration files. Some just do it to store timestamps of last-run activity. There are better options for this sort of thing; separate text files, SQLite database files, and such. Why roll this sort of thing into your app configuration? |
I maintain a very popular TOML implementation for C++ and this is not a frequent request. The people that want it request it very loudly, though. It's the noisy minority case. Most users just want to read from a config. Far fewer need to write one out. |
@bradleypeabody, one example is podman's library (containers/common) which uses XDG_CONFIG_HOME to read its configuration but also exposes an interface that updates the file back on disk. But maybe that gets into semantics over what configuration actually means. There's also a project uses TOML to store reverse-engineering artifacts (essentially comments), but since the output isn't designed for humans (rather git) I guess it doesn't really matter there...since it's being used for data exchange. @eksortso, 100% agreed, but as you know, you can't convince devers to follow any rules other than abide by the capabilities of their abstractions...and for the capabilities of those abstractions, you only have specifications. @marzer, no point. i just find it funny that toml self-labels as a format for humans, thinks config files should support comments, and then demotes them to being ignored by implementors (similar to json). two decades ago during the dotnet wars ini-style configuration was being phased out for this new machine-and-human-readable format called xml. back then, xml suggested that comments had value, and many people began doing crazy things like annotations and self-documentation because of the word MAY being written in the first version of the spec. in the end, many lives were lost in those wars...although, not at all related to data exchange. still, here's to the next decade of missed opportunities trying to define what human-friendly software actually means, and to our next generation not repeating this cycle. |
@arizvisa, @bradleypeabody: If you think this issue is important, why not bring it up with the implementers of your favorite TOML parser/writer? There is certainly nothing in the spec that prevents them from implementing such behavior. |
@ChristianSi, My favorite already does. ;-) |
Thanks for the feedback. In reply to a few of these points:
If we're talking about regular system daemons and such, I agree. However there are definitely uses cases where it makes sense such as tools that operate before compile time (e.g. code/configuration generators), as well as tools that support UI configuration as well as via config file - in which case having both scenarios use the same TOML file as the source of truth tends to make a lot of sense.
Fair enough, good to know.
Yeah this is a fair point, and it's actually how this started for me. But then questions immediately came up about the rules for such comment preservation. For example, what about comments not associated with any data (at the very top or bottom of the file, or just in between other lines with no clear association to a non-comment line). Each tool essentially has to figure all of this out on its own and decide on the tradeoffs between how detailed does the comment preservation have to be in order to be workable. Which sort of wraps back around to my original point while writing this: If the spec never says anything about it, then implementations will typically not support it or at best will do it inconsistently. If I put together a list of some guidelines of what might make sense, would that at least be worth looking at? There is a pretty big gap between preserving comments with byte-for-byte reproduction vs a list of a few rules that would be much easier to implement and result in the desired "won't vaporize your comments" behavior. Which is why I think it's worth exploring and writing up instead of just leaving it open ended. |
An issue that comes up more frequently than I believe is realized is the ability (lack thereof) to preserve comments during edits from automated tools, examples: #284 https://github.com/JuliaLang/TOML.jl/issues/32 . Other configuration languages lack this (YAML, JSON, and to some extent XML), and adding a behavior for this to the specification would give TOML yet another leg up.
The Go programming language is a great example of where I believe they got this right. Its tools can read a source file, make edits, and spit it back out with comments preserved (except where automated modifications removed the corresponding section). I realize it's not a configuration file format, but the use case is pretty much the same - you start by hand editing it, and then later you find that since what you're doing is successful and fantastic you want to make an automated tool to handle various things, and you don't want your comments automatically evaporated when you do this.
Not every single possible comment scenario needs to be preserved exactly. E.g. if an automated tool converts
a = 1 # just for now
to:I think such behaviors are perfectly acceptable and a workable tradeoff for being able to allow automated tools to safely modify TOML files without comment loss.
One of the challenges to different implementations adding this functionality is that there is no definition of how it should be done. I think a simple set of rules could be arrived at, which implementations could refer to and follow if they want to support this feature.
I'm sure a proposal could be worked out with an exact set of such rules if there is agreement/interest.
The text was updated successfully, but these errors were encountered: