-
-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Index source map lookup #53
Conversation
The old code improperly found the section to call recurse into. Sourcemap sections behave very similarly to the mappings, and we need to find the section whose offset is `<=` the desired position. Additionally, the `col` is only subtracted if it is on the offset's line. This matches [Mozilla's behavior](https://github.com/mozilla/source-map/blob/0.6.1/lib/source-map-consumer.js#L1000-L1003). And as a final bonus, we can binary search it to speed up searches on larger maps.
Token lookup uses "greatest lower bound" lookups, which means we need to find the token on the line which is `<=` than our desired column. But, the old code accidentally considered the last token to span infinite columns and infinite lines, meaning a lookup of `(10000000, 0)` would return the last token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 on the fix for indexed source maps. We have a couple of places now that are essentially manual reimplementations of binary search. Is it feasible that we can just switch to std binary_search_by_key
, or is there some minute differences there?
I’m not too convinced about the restriction that "token must be on same line" though.
Is this documented somewhere properly? Or is it just an implementation detail of the mozilla sourcemap implementation?
I was under the impression that we are just looking for a token match lower to the source position we want, as you said.
I don't think it's possible to use Rust's native binary search for our case. Because we need the closest
It's an implementation detail, originally implemented in mozilla/source-map@699dff1, which points to https://groups.google.com/g/mozilla.dev.js-sourcemap/c/Uo26hB5v5oY as the reasoning. |
You can still use binary search, you just have to match on the result then, like here: Lines 116 to 119 in a261d00
As for the behavior, thanks for posting the links. I believe this might make sense when you "compress" source code down to a single line. // # a.js:
export function a() {
return "a.js";
}
// # b.js:
export function b() {
return "b.js";
} You then simply concat these two files and create a sourcemap with 2 mappings: // v first sourcemap token at "generated" line 1, column 1 points to "a.js" at line 1, column 1
export function a() {
return "a.js";
}
// v second sourcemap token at "generated" line 5, column 1 points to "b.js" at line 1, column 1
export function b() {
return "b.js"; // <- we are looking at "generated" line 6, column C
} In this example, when we look up line 6, there is not mapping on that exact line, but rather one line before. So we have two choices here:
In this specific example, we would change from returning an approximate location to none at all, which is not really a nice thing to do. I would not do this filtering in our library code and rather return the approximate token. let line = 6;
let token = sm.lookup_token(line, 1).filter(|token| token.get_dst_line() == line);
assert_eq!(token, None); Assuming the same example as above, we can just |
Oh! I clearly didn't read the docs closely enough, I didn't notice that the
I'll split this into a separate PR, but I'd really suggest that this be implemented as Mozilla's does. We're using this package to implement error stack tracing, and with the current behavior we return results that do not actually exist in user's source code. We insert synthetic runtime code into the output bundles, which is not mapped ( function userCode() {// code doesn't throw.
}
(function runtimeCode() {
// some error is thrown by invalid runtime behavior
if (Math.random() >= 0.5) {
throw new Error();
} // Runtime was intended to run user code
userCode();
});
//# sourceMappingURL=data:application/json;charset=utf-8;base64,eyJ2ZXJzaW9uIjozLCJuYW1lcyI6WyJ1c2VyQ29kZSIsInJ1bnRpbWVDb2RlIiwicmFuZG9tIl0sInNvdXJjZXMiOlsiaW5wdXQuanMiXSwic291cmNlc0NvbnRlbnQiOlsiZnVuY3Rpb24gdXNlckNvZGUoKSB7XG4gIC8vIGNvZGUgZG9lc24ndCB0aHJvdy5cbn1cblxuKGZ1bmN0aW9uIHJ1bnRpbWVDb2RlKCkge1xuICAvLyBzb21lIGVycm9yIGlzIHRocm93biBieSBpbnZhbGlkIHJ1bnRpbWUgYmVoYXZpb3JcbiAgaWYgKE1hdGgucmFuZG9tKCkgPj0gMC41KSB7XG4gICAgdGhyb3cgbmV3IEVycm9yKCk7XG4gIH1cblxuICAvLyBSdW50aW1lIHdhcyBpbnRlbmRlZCB0byBydW4gdXNlciBjb2RlXG4gIHVzZXJDb2RlKCk7XG59KSJdLCJtYXBwaW5ncyI6IkFBQUEsU0FBU0EsUUFBVCxHQUFvQixDQUNsQjtBQUNEOztBQUVELFVBQVVDLFdBQVY7RUFDRTtFQURGLFNBRVdDLE1BRlg7SUFBQTtFQUFBLEVBTUU7OztFQU5GO0FBQUEifQ== The code thrown on line 7 ends up finding the token for What about taking |
That is an interesting idea, but would be a breaking change. I believe just chaining I’m wondering from the example you gave, you claim the additional runtime code does not come from a file, but does that mean it still has a sourcemap token? shouldn’t the error find that token then? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m wondering from the example you gave, you claim the additional runtime code does not come from a file, but does that mean it still has a sourcemap token? shouldn’t the error find that token then?
Not for the the runtime code. This is how Babel handles it, if the associated code is injected during the transformer on a new line, the generator doesn't insert any sourceless tokens. But if the code is on the same line, then it'll insert 1 sourceless segment at the start.
Last weeks Sorry for the delay here. |
Done. |
There was a second clippy lint which I fixed in #54 so this is merged now, thank you @jridgewell 🎉 |
I found two bugs while using the package.
The last token in aDeferred till later PRRegular
map would be returned for very large lookup line numbers.Index
source map would recurse into sectioni
,i+1
, etc, when it should have recursed into onlyi-1
.The first deals with the "greatest lower bound" that the Mozilla Source Map package used. When searching for a token, we want to find the one closest to(line, col)
, given(gen_line, gen_col) <= (line, col)
. This means that looking up(0, 10_000)
should return the last token on line 0 (assuming the generated column is less than 10,000). Effectively, the last segment in a line extends to infinity columns. But, if we look for a very large input line, eg(10_000, 0)
, we accidentally returned the last token in our map (again, assuming there are less than 10,000 generated lines). That's because that token's position is<=
our lookup position. This is a break with Mozilla's behavior, where the returned token must be on our lookup line.Similarly, an
Index
map had another bug with the greatest lower bound lookup behavior. Each section's offset behaves the same as a token, where a section map applies for all positions after its offset. So we need to again find a(off_line, off_col) <= (line, col)
. The old code would find only(off_line, off_col) > (line, col)
(and then would keep searching beyond!).Additionally, the lookup column of an
Index
lookup is only subtracted if the lookup line is the same as the section's offset line. Effectively, the section's first line is shifted byoff_col
, but all following lines start at generated column 0.