Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range info is off for \r\n line endings #127

Closed
mterwoord opened this issue Oct 4, 2019 · 4 comments
Closed

Range info is off for \r\n line endings #127

mterwoord opened this issue Oct 4, 2019 · 4 comments

Comments

@mterwoord
Copy link

I'm parsing YAML files, made on Windows with \r\n line endings. Parsing works fine (using YAML.parseDocument(string), except that range positions are off. It seems that somehow \r\n is only counted as a single character.

The following tests show the problem:

import YAML from '../../src/index'

test('basic', () => {
  // \n does work
  const src = `- 1\n- 2\n`
  const doc = YAML.parseDocument(src)
  expect(doc.errors).toHaveLength(0)
  const { items } = doc.contents
  expect(items[1].range).toStrictEqual([6, 7]);
})


test('basic', () => {
  // \r\n doesn't work
  const src = `- 1\r\n- 2\r\n`
  const doc = YAML.parseDocument(src)
  expect(doc.errors).toHaveLength(0)
  const { items } = doc.contents
  expect(items[1].range).toStrictEqual([7, 8]);
})
@eemeli
Copy link
Owner

eemeli commented Oct 4, 2019

Related: #20, #31

What's happening here is that yaml is normalising CR-LF line ends into LF before parsing, as it must do at least for scalar values. This happens before tokenisation, which is when the range indices are calculated; hence the offset you're seeing.

Now, depending on your use case, there are a couple of ways to get the ranges to match what you're expecting:

For the simplest thing, if you set the { keepCstNodes: true } option, the normalised source is available as doc.cstNode.context.src; the reported ranges will match to it as you'd expect. This option is false by default in order to allow references to the CST to go out of scope after parsing is complete and thereby let the internal objects be garbage-collected.

If that's not enough, the CST output has a method setOrigRanges() for calculating the origStart & origEnd references to the original source. This is how you can make use of it at the AST level:

const src = `- 1\r\n- 2\r\n`
const cst = YAML.parseCST(src)
cst.setOrigRanges() // return indicates if orig* have been set
const doc = new YAML.Document({ keepCstNodes: true }).parse(cst[0])
const { items } = doc.contents
expect(items[1].cstNode.range).toMatchObject({
  start: 6, end: 7, origStart: 7, origEnd: 8
})

So that works, but it's admittedly rather clumsy and a bit obscure. The reason for that is that you're the first one to ask after this feature, and I've not bothered to do anything about it pre-emptively. If you think the ergonomics of the API ought to be improved, could you elaborate a bit on how you're planning on using this functionality?

@mterwoord
Copy link
Author

Thanks for replying.

What I'm doing, is use this library as parsing library for https://github.com/terwoord/azure-pipelines-overview. It's used for parsing the yaml and providing a way to navigate to nodes.

I worked around it by counting the amount of \n characters before the range position, and add that to the number. I know, it's not 100% fool proof, but seems to work currently.

@eemeli
Copy link
Owner

eemeli commented Oct 28, 2019

Closing this as won't-fix, given that workarounds exist. Happy to reconsider later if/when this becomes an issue again.

@eemeli eemeli closed this as completed Oct 28, 2019
@pflannery
Copy link

I also ran in to this issue https://github.com/vscode-contrib/vscode-versionlens/issues/193
I managed to fix it by adding the line number to the range

function getRangeFromCstNode(cstNode, opts) {
  const crLfLineOffset = opts.hasCrLf ?
    cstNode.rangeAsLinePos.start.line : 0;

  const start = cstNode.range.start + crLfLineOffset;
  const end = cstNode.range.end + crLfLineOffset;

  return { start, end }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants