Fails to encode multi-line strings (and other yaml string bugs) #37

Open
kageurufu opened this issue Mar 19, 2024 · 0 comments

I need to encode multi-line strings within a record, and they often contain quotes. I also have octal permission strings such as 0644, which a YAML parser would read as a number (an octal literal under YAML 1.1) instead of a string. I would expect Encode.string to either escape and quote such values properly, or to generate a YAML multi-line block scalar. For example:

escapeme: "0700"
oneline: "multi line string\nwith \"quoted\" text"
multiline: |-
  multi line string
  with "quoted" text

For now I'm working around this with the following:

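import String.Extra
import Yaml.Encode as YE -- assumption: YE aliases this library's encoder module


-- Escape the string, wrap it in double quotes, then pass it to YE.string.
safeEncodeString : String -> YE.Encoder
safeEncodeString =
    encodeBackslashes
        >> encodeNewlines
        >> encodeQuotes
        >> String.Extra.quote
        >> YE.string


-- Must run first, so the backslashes added by later steps are not doubled.
encodeBackslashes : String -> String
encodeBackslashes =
    String.replace "\\" "\\\\"


-- Turn a real newline into the two-character escape sequence \n.
encodeNewlines : String -> String
encodeNewlines =
    String.replace "\n" "\\n"


-- Escape embedded double quotes.
encodeQuotes : String -> String
encodeQuotes =
    String.replace "\"" "\\\""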

Edit:

While reading https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell for other potential pitfalls, I've noted:

  • record keys may be parsed in odd ways, and the easy approach may just be to quote them all
  • strings starting with an asterisk (aliases), exclamation point (tags), or ampersand (anchors) must be quoted
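
A rough sketch of what a conservative "does this need quotes?" check could look like in Elm, usable for both keys and values. `needsQuoting`, `startsUnsafely`, and `reservedWords` are hypothetical names, not part of any existing library, and the character and word lists are assumptions that err on the side of quoting too much rather than being exhaustive:

```elm
module YamlQuoting exposing (needsQuoting)

-- Hypothetical helper, not part of any library's API.
-- Returns True when writing the string as a plain (unquoted) YAML scalar
-- risks it being read back as something other than the same string.


needsQuoting : String -> Bool
needsQuoting s =
    String.isEmpty s
        || startsUnsafely s
        || List.member (String.toLower s) reservedWords
        || String.contains "\n" s
        || String.contains ": " s
        || String.contains " #" s
        || String.endsWith ":" s
        || (String.trim s /= s)


-- A leading digit, sign, or dot may be read as a number ("0700", "22:22",
-- ".5"). The other characters are YAML indicators such as * (alias),
-- ! (tag), and & (anchor).
startsUnsafely : String -> Bool
startsUnsafely s =
    case String.uncons s of
        Just ( first, _ ) ->
            Char.isDigit first
                || String.contains (String.fromChar first) "*!&-+.?:,[]{}#|>@`\"'%~=<"

        Nothing ->
            False


-- Words that some YAML versions resolve to booleans or null.
reservedWords : List String
reservedWords =
    [ "true", "false", "yes", "no", "on", "off", "y", "n", "null" ]
```

With this sketch, `needsQuoting "0700"` and `needsQuoting "*ref"` are both `True`, while `needsQuoting "hello"` is `False`. Quoting everything unconditionally would be simpler still. A check like this only matters if unquoted output is wanted where it is safe.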

I'm of the opinion that YAML encoding here should be as aggressively safe as possible, and that parsing an encoded document should give identical results regardless of which version of the YAML specification the parser follows. String escaping and quoting should probably be the default, and multi-line block strings might just be too problematic to deal with (at a minimum, trailing newlines would need to be handled properly).
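
To illustrate the trailing-newline problem: a literal block scalar needs a different chomping indicator depending on how the string ends. The sketch below only picks the header. `Chomping`, `chompingFor`, and `blockHeader` are hypothetical names, and it ignores other cases a real encoder would have to handle, such as strings made up only of newlines or first lines that begin with spaces:

```elm
module BlockHeader exposing (Chomping(..), blockHeader, chompingFor)

-- Hypothetical sketch, not part of any library's API.


type Chomping
    = Strip -- "|-": the string has no trailing newline
    | Clip -- "|": the string ends with exactly one newline
    | Keep -- "|+": the string ends with two or more newlines


-- Decide how trailing newlines must be preserved for a round trip.
chompingFor : String -> Chomping
chompingFor s =
    if String.endsWith "\n\n" s then
        Keep

    else if String.endsWith "\n" s then
        Clip

    else
        Strip


-- The block scalar header to write after "key: ".
blockHeader : String -> String
blockHeader s =
    case chompingFor s of
        Strip ->
            "|-"

        Clip ->
            "|"

        Keep ->
            "|+"
```

For the string in the example above, which has no trailing newline, `blockHeader` returns `"|-"`, matching the `multiline: |-` form shown there.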
