-
-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Added debug IDs to source bundle javascript files #762
Conversation
|
20d4201
to
835a85c
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #762 +/- ##
==========================================
+ Coverage 74.22% 74.53% +0.31%
==========================================
Files 70 70
Lines 15117 15317 +200
==========================================
+ Hits 11220 11417 +197
- Misses 3897 3900 +3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its very unfortunate that we use the same file format for both artifact bundles (that need metadata), and source bundles (that only care about files).
I would separate these two things from one another. We could have an accessor for the metadata which lazily parses the JSON manifest.
This metadata then allows you to query things by path or by debug-id, etc.
For applying the native source context, we do not need the metadata at all.
For that native use-case, the file paths in the zip are completely predictable.
Profiling has even shown that unzipping and parsing the JSON manifest (which can be hundreds of KiB) is expensive, especially considering that you mostly want just a couple of files from it.
So there, it might be cheaper to just access the contents directly via the predictable file paths, and do the manifest parsing and lookup as a fallback.
I’m not entirely sure what the best idea is to deal with "contents" vs "url". We could do the same logic there as well.
One optimized code path for:
- Give me the contents of the file contained in the zip.
Plus another code path for:
- Give me the metadata (including the url where to fetch the file) for file with name or debug-id.
if possible_sourcemap.starts_with("data:application/json") { | ||
return None; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if we actually want to return the data
url here? Otherwise we would have to do that same detection logic in symbolicator again. CC @loewenheim wdyt, as you implemented the base64 encoded data urls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should expose the data url here, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure though if we should leave the data url prefix alone; if the URL is returned as a plain string, we will need to be able to recognize it later.
Co-authored-by: Sebastian Zivota <loewenheim@users.noreply.github.com>
Co-authored-by: Arpad Borsos <swatinem@swatinem.de>
Co-authored-by: Arpad Borsos <swatinem@swatinem.de>
6cd5ee0
to
fba2b69
Compare
This adds a new header
debug-id
to files stored in a source bundle that is used as artifact bundle. The goal of this is to enable the lookup of JavaScript files by debug ID. As we are not using much of theDebugSession
abstraction for this yet, i chose not to add this to the trait but just the particular implementation of a session for source bundles. Lookup requires knowledge of the file type and only returns contents, no meta information.Manifest Format Changes
The manifest gains a new well known key in the
headers
dictionary calleddebug-id
which so far is exclusively used to refer to debug IDs of source maps and minified source files. However it's intentionally kept generic in case there is further use of this in the future. For instance it might be valuable for discovering debug files for.wasm
files on the internet via HTTP header.Additionally the use of
debug_id
in the toplevel manifest is now also used to refer to the artifact bundle for JavaScript. This means that in the JavaScript case we have multiple IDs at play:debug_id
: refers to the artifact bundle (source bundle) of a particular upload / releasedebug_id
: refers to the particular debug ID of an individual minified source and it's corresponding source mapThis also changes the implementation to always lowercase headers before writing and after reading for simplified handling.
Source Maps vs Indexed RAM Bundles
Because we have different
SourceFileType
s for normal source maps and indexed RAM bundles they require separate handling for the caller. This is probably sensible as they are nothing alike but it's a bit odd.Artifact vs Source Bundle
This commit also documents that a source bundle and an artifact bundle are really the same thing, with some minor differences about how we refer to it depending on if it's use to hold sources for a debug file, or if it holds release like artifacts for JavaScript processing.
Other API Changes
This also adds missing functionality to the
SourceBundleDebugSession
to make this actually useful for JavaScript processing:source_by_url
is added to look up a source by URL. Open question here is if this can just be the same assource_by_path
? It's odd that we have two ways to key this really.source_by_debug_id
is the canonical way now to look up sources by debug ID + kindSourceFileDescriptor
is added to better describe what a source is for. The descriptor has all the API necessary to retrieve information from source files (contents and meta data) that is relevant for symbolication and unminification:ty
returns theSourceFileType
. This was previously impossible to read after setting.contents
returns the contents if available, otherwiseurl
should be used for fetching.url
doubles as a way to either refer to a source link or the canonical URL of the source file. There is no reason why these need to be separate methods as they are referring to the same thing anyways.path
is just the path that was used to look up the source for native source bundle companionsdebug_id
returns the parsedDebugId
for source maps and minified sources in bundles that have debug IDssource_mapping_url
returns either thesourcemap
orx-sourcemap
reference in the manifest of the bundle. If neither can be found, it falls back to parsing the embedded contents. This makes it convenient enough to use though one could make an argument that the parsing logic should go elsewhere.