Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track line offsets for better accuracy of inline sourcepos #453

Merged
merged 2 commits into from
Aug 17, 2024

Conversation

digitalmoksha
Copy link
Collaborator

@digitalmoksha digitalmoksha commented Aug 13, 2024

Adds a line_offsets vector used to record the number of spaces stripped from each line of content added. This allows us to more accurately calculate sourcepos for inline elements.

This addresses #452

@digitalmoksha
Copy link
Collaborator Author

digitalmoksha commented Aug 13, 2024

Initial spaghetti against the wall. Try using a line offset vector.

It seems to solve the basic case. But small breakage in table header paragraph and the link_sourcepos_truffle and link_sourcepos_truffle_bergamot are not fixed.

Copy link
Contributor

Command Mean [ms] Min [ms] Max [ms] Relative
./bench.sh ./comrak-564b359 327.7 ± 1.8 324.4 331.9 2.96 ± 0.03
./bench.sh ./comrak-main 324.2 ± 1.5 321.8 327.2 2.93 ± 0.03
./bench.sh ./pulldown-cmark 110.6 ± 0.9 109.6 112.7 1.00
./bench.sh ./cmark-gfm 118.9 ± 0.9 117.6 121.2 1.08 ± 0.01
./bench.sh ./markdown-it 479.8 ± 4.4 473.2 493.9 4.34 ± 0.05

Run on Tue Aug 13 22:35:37 UTC 2024

@digitalmoksha
Copy link
Collaborator Author

digitalmoksha commented Aug 15, 2024

@kivikakk would you be able to take a look at this when you have a moment? I think this solves one of the main issues of inline sourcepos. Or at least gets us much closer.

I was able to solve the problems that we had with tables. But I haven't tackled any of the autolinking problems yet, which would be in another PR.

@digitalmoksha digitalmoksha changed the title Fix inline sourcepos Track line offsets for better accuracy of inline sourcepos Aug 15, 2024
Copy link
Owner

@kivikakk kivikakk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so good! Well done! You made this so much easier than I imagined. :)

src/cm.rs Outdated
&& isspace(literal[literal.len() - 2])))
&& !first_in_list_item
&& !self.options.render.prefer_fenced
if !(!info.is_empty()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !(!info.is_empty()
if !(info.len() > 0

Super minor, but I can't read multiple negatives (and "empty" itself is a bit negative ..).

src/nodes.rs Show resolved Hide resolved

let node_ast = node.data.borrow();
let adjusted_line = self.line - node_ast.sourcepos.start.line;
self.line_offset = *node_ast.line_offsets.get(adjusted_line).unwrap_or(&0);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this unwrap_or needed, or is it purely speculative? If it's the latter, I vote for this:

Suggested change
self.line_offset = *node_ast.line_offsets.get(adjusted_line).unwrap_or(&0);
self.line_offset = node_ast.line_offsets[adjusted_line];

Put another way: could this mask a bug one day? Or is it actually expected that the corresponding line offset may not be found for a valid reason? I've tried this change and the unit and spec tests pass, no crashes on fuzzing, so I think we should go with it.

See also: Contracts, Undefined Behavior, and Defensive Programming.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was being more defensive. But I think you're right, it's not needed. And if it's passing fuzzing, then I'm more confident.

I also changed it in table.rs

@digitalmoksha
Copy link
Collaborator Author

Thanks @kivikakk, incorporated the changes!

@digitalmoksha digitalmoksha merged commit 25f4b06 into main Aug 17, 2024
39 checks passed
@digitalmoksha digitalmoksha deleted the bw-inline-sourcepos branch August 17, 2024 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants