Define how to extract the `sourceMappingURL` comment #94

nicolo-ribaudo · 2024-06-13T12:53:55Z

This was originally opened at tc39/source-map-spec#30

I am currently being hand-wavy about CSS, only saying "it should be similar to JS". I can propose the adjusted algorithm in a followup, but given that technically source maps are not language-specific we might also just say "other text languages should be like JS, adapted to their own comments syntax".

This patch explicitly defines how to extract such comments from JavaScript, CSS and WebAssembly sources.

It defines multiple ways to do so: either by actually parsing the code, or by just going through all the lines of the program looking for what "looks like" a comment. This is so that different implementations can choose what's best for them, depending on whether they are already parsing the code or not.

To ensure consistent behavior across implementations that choose different strategies, the specification enforces additional requirements on tools that append a `sourceMappingURL` comment to the generated code: the comment must be placed in such a way that all extraction methods yield the same result. This is not an unreasonable burden: if the program is syntactically valid, simply adding the comment at the end of the file, followed at most by other tool-injected comments, is enough. This requirement is lifted if the input code given to the tool is already "maliciously crafted", since we would otherwise require tools to rewrite that code (for example, by splitting strings that contain something that looks like a comment).

The JS extraction method has the following properties:

- It iterates line by line. Implementations can thus optimize it by going through each line _in reverse order_, and then scanning through its characters from the beginning to the end (which is what a regexp would do).
- It expects multi-line comments to actually be in a single line.
- It returns the last `sourceMappingURL` comment (or well, comment-like) found in the source.
- It only considers comments after the last piece of code (i.e. it discards any comment found so far every time it sees some non-comment non-whitespace characters).
- It has no requirements about what is _before_ a comment. Adding the comment at the end of the file without first ensuring that there is a newline before it is valid.
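The properties above can be illustrated with a minimal Python sketch of a line-based scanner. This is not the algorithm text from the patch: the function name `extract_source_map_url`, the comment-body regular expression, and the reduced white space set are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed shape of a sourceMappingURL comment body; the patch's exact
# matching rules may differ.
_URL_IN_COMMENT = re.compile(r"^[#@]\s*sourceMappingURL=(\S*?)\s*$")

# Simplification: a small fixed set instead of full ECMAScript white space.
_WHITE_SPACE = " \t\v\f\ufeff\u00a0"


def extract_source_map_url(source: str) -> Optional[str]:
    """Return the last comment-like sourceMappingURL after the last code."""
    last_url = None
    # Iterate line by line, splitting on ECMAScript line terminators.
    for line in re.split(r"\r\n|[\n\r\u2028\u2029]", source):
        pos = 0
        while pos < len(line):
            if line[pos] in _WHITE_SPACE:
                pos += 1
            elif line.startswith("//", pos):
                # Single-line comment: it runs to the end of the line.
                match = _URL_IN_COMMENT.match(line[pos + 2:])
                if match:
                    last_url = match.group(1)
                break
            elif line.startswith("/*", pos):
                end = line.find("*/", pos + 2)
                if end == -1:
                    # Not closed on this line: treated as code in this sketch.
                    last_url = None
                    break
                match = _URL_IN_COMMENT.match(line[pos + 2:end])
                if match:
                    last_url = match.group(1)
                pos = end + 2
            else:
                # Non-comment, non-whitespace character: discard what was found.
                last_url = None
                pos += 1
    return last_url
```

With this sketch, `extract_source_map_url("foo();//# sourceMappingURL=a.map")` yields `"a.map"` even though no newline precedes the comment, while a comment that is followed by code on a later line is discarded.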

I have left the CSS extraction method as a TODO because I first want to check how you feel about the JS one.

nicolo-ribaudo · 2024-06-13T12:56:41Z

source-map.bs

+        1. [=Collect a sequence of code points=] that are [=white space code points|ECMAScript
+            white space code points=] from |line| given |position|.


In the original PR there was this comment by @gibson042:

> Is it a problem that ECMAScript white space is subject to change over time as future Unicode editions change the set of code points in general category "Space_Separator"?

I think it's ok to expect implementations to evolve together with Unicode, but how do other folks feel?
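To make the concern concrete, here is a small Python sketch of an ECMAScript white space check. The helper name `is_ecmascript_white_space` is hypothetical. The `Zs` lookup goes through Python's `unicodedata` module, so the result depends on the Unicode version bundled with the interpreter, which is the kind of drift being discussed. For example, U+180E was moved out of Space_Separator in Unicode 6.3.

```python
import unicodedata

# Code points ECMAScript lists explicitly as white space: TAB, VT, FF, ZWNBSP.
_EXPLICIT_WHITE_SPACE = {"\u0009", "\u000b", "\u000c", "\ufeff"}


def is_ecmascript_white_space(ch: str) -> bool:
    """Return True if ch is an explicit white space code point or in Zs."""
    # "Zs" is the Unicode general category Space_Separator.
    return ch in _EXPLICIT_WHITE_SPACE or unicodedata.category(ch) == "Zs"


# The Unicode version this interpreter uses for the category lookup.
print(unicodedata.unidata_version)
```

Line terminators such as U+000A are deliberately not covered here, since ECMAScript treats them as a separate class from white space.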

jkup

This looks really good! I'm keen to get the changes in so we can continue hardening this part of the spec. I think we're definitely ok with updating the spec if JavaScript adds more spaces to their spec.

source-map.bs

SHA: 0067d9f Reason: push, by jkup Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

gibson042 · 2024-06-13T13:09:49Z

source-map.bs

+### Linking through HTTP headers
+
+If a file is served through HTTP(S) with a `sourcemap` header, the value of the header is
+the URL of the linked source map.
+
+```
+sourcemap: <url>
+```
+
+Note: Previous revisions of this document recommended a header name of `x-sourcemap`.  This
+is now deprecated; `sourcemap` is now expected.


Some [belated] observations:

- The most precise vocabulary is "[HTTP] header field" per RFC 9110; should that be adopted in this document or should it stick with the colloquial "header"?
- `sourcemap` really should be registered in Message Headers, but currently is not.
- Is `<url>` valid and meaningful? For a more precise and analogous definition, see RFC 8288.
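For readers unfamiliar with the header-based linking quoted in the diff, a consumer might look roughly like the Python sketch below. The function name `source_map_url_from_headers` is hypothetical. Falling back to the deprecated `x-sourcemap` name and resolving the value relative to the file's URL are both assumptions of this sketch, not behavior stated in the quoted text.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin


def source_map_url_from_headers(
    file_url: str, headers: Mapping[str, str]
) -> Optional[str]:
    """Return the linked source map URL from response headers, if any."""
    # HTTP field names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    # Prefer `sourcemap`; the `x-sourcemap` fallback is an assumption.
    value = lowered.get("sourcemap") or lowered.get("x-sourcemap")
    if not value:
        return None
    # Assumption: a relative value is resolved against the file's own URL.
    return urljoin(file_url, value)
```

A call such as `source_map_url_from_headers("https://example.com/js/app.js", {"SourceMap": "app.js.map"})` returns `"https://example.com/js/app.js.map"`.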

nicolo-ribaudo commented Jun 13, 2024

jkup mentioned this pull request Jun 24, 2024

Which versions are required to be supported by implementations? #10

Open

jkup self-requested a review June 25, 2024 14:22

jkup approved these changes Jun 25, 2024

nicolo-ribaudo mentioned this pull request Jun 25, 2024

My dream spec text #105

Open

takikawa reviewed Jun 25, 2024

nicolo-ribaudo added 4 commits June 25, 2024 16:43

Add fallback null

dadf256

Also return null for wasm

faaf960

Disallow multiple custom sections for sourceMappingURL

f14a7f8

Improve wording

bffb47d

jkup self-requested a review June 25, 2024 14:53

jkup approved these changes Jun 25, 2024

jkup merged commit 0067d9f into tc39:main Jun 25, 2024
1 of 2 checks passed

github-actions bot added a commit that referenced this pull request Jun 25, 2024

Merge pull request #94 from nicolo-ribaudo/extract-source-mapping-url

a220c36

nicolo-ribaudo deleted the extract-source-mapping-url branch June 25, 2024 14:57

nicolo-ribaudo mentioned this pull request Jun 26, 2024

//# sourceMappingURL=... doesn't have to be on the last line #64

Closed

gibson042 reviewed Jul 14, 2024
