Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: Fix YAML maps key type #4138

Merged
merged 1 commit into from
Feb 9, 2018
Merged

Conversation

dmgawel
Copy link
Contributor

@dmgawel dmgawel commented Dec 1, 2017

To support boolean keys, the yaml package unmarshals maps to map[interface{}]interface{} (see: go-yaml/yaml#139). This behaviour is unexpected in Hugo case and breaks maps ranging and converting to json. This PR follows many yaml package dependent projects and changes all maps to map[string]interface{} like we would've gotten from json. Config exported by https://github.com/go-yaml/yaml would be perfect solution but it looks like it will not come quickly.

Fixes #2441, #4083

case map[interface{}]interface{}:
res := make(map[string]interface{})
for k, v := range in {
kStr, _ := k.(string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes the key is actually a string, an assumption we cannot make without breakage. Also, this PR could need some more tests, esp. re the recursiveness of this + types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I've changed key handling and non-string values are now converted to strings. Additionally I added tests for nested maps and different key types.

Recurse through result of yaml package parsing and change all
maps from map[interface{}]interface{} to map[string]interface{}
making them jsonable and sortable.

Fixes gohugoio#2441, gohugoio#4083
case map[interface{}]interface{}:
res := make(map[string]interface{})
for k, v := range in {
res[fmt.Sprintf("%v", k)] = stringifyYAMLMapKeys(v)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still assumes that this is a map with string keys. If I, for some reason, use integers, then I don't want Hugo to convert them into strings.

  • We can probably assume that all the keys are of the same type, so check one element before doing a conversion.
  • Also, use cast.ToStringE to convert into a string (much faster)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point but I'd like to point out that converting all keys to strings is a default behaviour for both TOML and JSON parsers in Hugo. What should we do about that? :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bep. Have you thought this through? IMO keeping consistent parsing behaviour between different formats is a better approach, but I leave the decision to you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"default behaviour for both TOML and JSON parsers in Hugo." I think you are right for the top level maps. But here you are doing a deep conversion, which does not make sense. Arrest me if I'm wrong because I have not looked at this code in a while.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review a simple test case I wrote and run on Hugo code without my changes:

package main

import (
	"github.com/gohugoio/hugo/parser"
	"fmt"
)

func main() {
	res, _ := parser.HandleJSONMetaData([]byte("{\"inted\": {\"1\": \"a\"}, \"booled\": { \"true\": \"a\"}}"))
	fmt.Printf("%#v\n", res)
	// map[string]interface {}{"inted":map[string]interface {}{"1":"a"}, "booled":map[string]interface {}{"true":"a"}}

	res, _ = parser.HandleTOMLMetaData([]byte("[inted]\n1 = \"a\"\n[booled]\ntrue = \"a\""))
	fmt.Printf("%#v\n", res)
	// map[string]interface {}{"inted":map[string]interface {}{"1":"a"}, "booled":map[string]interface {}{"true":"a"}}

	res, _ = parser.HandleYAMLMetaData([]byte("inted:\n  1: a\nbooled:\n  true: a"))
	fmt.Printf("%#v\n", res)
	// map[string]interface {}{"inted":map[interface {}]interface {}{1:"a"}, "booled":map[interface {}]interface {}{true:"a"}}
}

It looks that indeed Hugo unmarshals TOML and JSON to map[string] not only for top-level maps. But maybe I did something wrong, I'm not a Go expert :-)

Deep conversion would made no sense for article front matter but it's intended for complex data stored in data directory. YAML seems easier and more readable to structure nested maps and arrays than TOML but because of this inconsistency one cannot sort or range over those maps.

@vassudanagunta
Copy link
Contributor

This behaviour is unexpected in Hugo case and breaks maps ranging and converting to json.

Can you provide a test case that shows this top-level problem?

bep pushed a commit that referenced this pull request Feb 2, 2018
*  Adds retro-coverage for #4361
*  Verifies open issues #4138, #3890, #4366, 4083
*  Removes test reliance on the very code it is testing (hugo/parser package).
   Expected results are now all built manually / are more precise.
   Tests can run against different versions (no linkage errs)
@bep bep added this to the v0.37 milestone Feb 9, 2018
@bep bep merged commit 16a5c74 into gohugoio:master Feb 9, 2018
@vassudanagunta
Copy link
Contributor

vassudanagunta commented Feb 9, 2018

@bep This merge broke data directory tests that were added since this PR's commit (5c89674, was made 2 months ago). Those tests were added (PR #4373) to increased coverage of the data directory as well as to illuminate de facto behavior, whether or not it desirable, so that it could be discussed and acted upon. That discussion (#4379) is still waiting. I pointed out the different treatment of YAML, the possible justification for it, and asked Do we need to support non-string map keys in YAML data? Likewise above you both discussed this:

@bep:

"default behaviour for both TOML and JSON parsers in Hugo." I think you are right for the top level maps. But here you are doing a deep conversion, which does not make sense. Arrest me if I'm wrong because I have not looked at this code in a while.

@dmgawel:

Deep conversion would made no sense for article front matter but it's intended for complex data stored in data directory. YAML seems easier and more readable to structure nested maps and arrays than TOML but because of this inconsistency one cannot sort or range over those maps.

Is the merge of this PR an implicit answer of "No, we don't need to support non-string map keys in YAML data"? Or was this change only meant for front matter? The tests I added were intended to guard against unintentional changes to data directory processing.

If the former I will update the tests to reflect the new specification. If the latter I can add code so that YAML in the data directory is treated the old way, different from YAML front matter. It will be easy because I've already done that for JSON.

It would be nice to have the #4379 discussion.

/cc @moorereason

@github-actions
Copy link

github-actions bot commented Feb 7, 2022

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 7, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert YAML to JSON with menu fails
3 participants