feat(wasm): WASM support (#301)
This implements WASM support. It uses a slightly modified protocol in which a sender can
supply an `addr_mode` to be explicit about the addressing mode. `"abs"` is the default
absolute addressing mode. The other option is `rel:index`, where `index` is the index of a
module; the instruction address is then considered relative to that module. This change
also lets existing users benefit by providing a relative address within any
other debug module as an alternative.
mitsuhiko authored Nov 26, 2020
1 parent 025441c commit 41b838f
Showing 14 changed files with 625 additions and 163 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
- Add `processing_pool_size` configuration option that allows to set the size of the processing pool. ([#273](https://github.com/getsentry/symbolicator/pull/273))
- Use a dedicated `tmp` sub-directory in the cache directory to write temporary files into. ([#265](https://github.com/getsentry/symbolicator/pull/265))
- Use STATSD_SERVER environment variable to set metrics.statsd configuration option ([#182](https://github.com/getsentry/symbolicator/pull/182))
+- Added WASM support. ([#301](https://github.com/getsentry/symbolicator/pull/301))

### Bug Fixes

72 changes: 62 additions & 10 deletions Cargo.lock

Some generated files are not rendered by default.

9 changes: 8 additions & 1 deletion docs/advanced/symbol-server-compatibility.md
@@ -75,6 +75,11 @@ Specifically, the code and debug identifiers are defined as follows:
bidirectionally from the UUID.
- **Debug ID:** The same UUID, amended by a `0` for age.

+**WASM**:
+
+- **Code ID:** The bytes as specified in the `build_id` custom section.
+- **Debug ID:** The code ID truncated to 16 bytes, amended by a `0` for age.

**PE** / **PDB**:

- **Code ID:** The hex value of the `time_date_stamp` in the COFF header
@@ -266,6 +271,7 @@ The build-id hex representation is always provided in **lowercase**.

- **ELF** (binary, potentially stripped)
- **ELF** (debug info)
+- **WASM** (debug info)

Symbol bundles are supported by adding a `.src.zip` suffix to the ELF:

@@ -297,7 +303,7 @@ The following layout types support this lookup:
If you have no requirements to be compatible with another system you can also
use the "unified" directory layout structure. This has the advantage that it's
unified across all platforms and thus easier to manage. It can store breakpad
-files, PDBs, PEs and everything else. The `symsorter` tool in the symbolicator
+files, PDBs, PEs and everything else. The `symsorter` tool in the symbolicator
repository can automatically sort debug symbols into this format and also
automatically create source bundles.

@@ -309,6 +315,7 @@ The debug id is in all cases lowercase in hex format and computed as follows:
- **PDB**: `<Signature><Age>` (age in hex, not padded)
- **ELF**: `<code_note_byte_sequence>`
- **MachO**: `<uuid_bytes>`
+- **WASM**: `<BuildId>`

The path format is then as follows:

11 changes: 10 additions & 1 deletion docs/api/response.md
@@ -34,8 +34,8 @@ symbolication succeeds within a configured timeframe (around 20 seconds):

// Frame information
"instruction_addr": "0xfeedbeef", // actual address of the frame
+"addr_mode": "abs", // address mode
"sym_addr": "0xfeed0000", // start address of the function
-"line_addr": "0xfeedbe00", // start address of the line
"package": "/path/to/module.so", // path to the module's code file
"symbol": "__1cGmemset6FpviI_0_", // original mangled function name
"function": "memset", // demangled short version of symbol
@@ -73,6 +73,15 @@ occurred during symbolication, such as missing symbol files or unresolvable
addresses within symbols are reported as values for `status` in both modules and
frames.

+## Note on Addresses
+
+Addresses (`instruction_addr` and `sym_addr`) are either absolute or relative.
+Symbolicator always tries to make addresses absolute, but in some cases this
+cannot be done. For instance, WASM modules do not have absolute addresses, so
+their addresses stay relative. The `addr_mode` property identifies which case
+applies: `"abs"` means the addresses are absolute, and `"rel:X"` means they
+are relative to the module at index `X`.
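
As an illustration of the note above, a client reading a response could interpret `addr_mode` along these lines. This is a minimal sketch rather than Symbolicator's own code: the helper names `parse_addr_mode` and `describe_frame` are hypothetical, and treating a missing `addr_mode` as `"abs"` is an assumption based on the commit message calling `"abs"` the default.

```python
from typing import Optional, Tuple


def parse_addr_mode(addr_mode: Optional[str]) -> Tuple[str, Optional[int]]:
    """Return ("abs", None) or ("rel", module_index) for an addr_mode value.

    Assumption: a missing addr_mode is treated as "abs" (the stated default).
    """
    if addr_mode is None or addr_mode == "abs":
        return ("abs", None)
    kind, sep, index = addr_mode.partition(":")
    if kind == "rel" and sep and index.isdecimal():
        return ("rel", int(index))
    raise ValueError(f"unsupported addr_mode: {addr_mode!r}")


def describe_frame(frame: dict) -> str:
    """Render a frame's instruction address together with its addressing mode."""
    kind, index = parse_addr_mode(frame.get("addr_mode"))
    addr = int(frame["instruction_addr"], 16)  # accepts the "0x" prefix
    if kind == "abs":
        return f"{addr:#x} (absolute)"
    return f"{addr:#x} (relative to module {index})"


if __name__ == "__main__":
    print(describe_frame({"instruction_addr": "0xfeedbeef", "addr_mode": "abs"}))
    print(describe_frame({"instruction_addr": "0x1f", "addr_mode": "rel:0"}))
```

Rejecting unknown modes with an error, instead of silently falling back to absolute, keeps a relative address from being mistaken for an absolute one.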

## Backoff Response

If symbolication takes longer than the threshold `timeout`, the server instead
24 changes: 13 additions & 11 deletions docs/api/symbolication.md
@@ -22,7 +22,8 @@ Content-Type: application/json
{
"frames": [
{
-"instruction_addr": "0xfeedbeef"
+"instruction_addr": "0xfeedbeef",
+"addr_mode": "rel:0"
},
...
],
@@ -61,17 +62,18 @@ as well as external sources to pull symbols from:
- `sources`: A list of descriptors for internal or external symbol sources. See
[Sources](index.md).
- `modules`: A list of code modules (aka debug images) that were loaded into the
-process. All attributes other than `type` are required. The Symbolicator may
-optimize lookups based on the `type` if present. Valid types are `macho`,
-`pe`, `elf`. Invalid types are silently ignored. The Symbolicator still works
-if the type is invalid, but less efficiently. However, a schematically valid
-but _wrong_ type is fatal for finding symbols.
+process. All attributes other than `type`, `image_addr` and `image_size` are
+required. The Symbolicator may optimize lookups based on the `type` if present.
+Valid types are `macho`, `pe`, `elf`. Invalid types are silently ignored. The
+Symbolicator still works if the type is invalid, but less efficiently. However,
+a schematically valid but _wrong_ type is fatal for finding symbols.
- `threads`: A list of process threads to symbolicate.
-- `registers`: Optional register values aiding symbolication heuristics. For
-example, register values may be used to perform correction heuristics on the
-instruction address of the top frame.
-- `frames`: A list of frames with addresses. Arbitrary additional properties
-may be passed with frames, but are discarded.
+- `registers`: Optional register values aiding symbolication heuristics. For
+example, register values may be used to perform correction heuristics on the
+instruction address of the top frame.
+- `frames`: A list of frames with addresses. Arbitrary additional properties
+may be passed with frames, but are discarded. The `addr_mode` property
+defines the behavior of `instruction_addr`.
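
To illustrate the `frames` entries described above, a sender could assemble each frame with an explicit `addr_mode` roughly as follows. This is a sketch under stated assumptions: `make_frame` is a hypothetical helper, not part of Symbolicator, and it uses only the two frame properties shown in the request example (`instruction_addr` and `addr_mode`).

```python
import json
from typing import Optional


def make_frame(instruction_addr: int, module_index: Optional[int] = None) -> dict:
    """Build one frame entry; hypothetical helper for illustration only.

    With no module index the address is sent as absolute ("abs");
    otherwise it is marked as relative to that module ("rel:<index>").
    """
    addr_mode = "abs" if module_index is None else f"rel:{module_index}"
    return {"instruction_addr": hex(instruction_addr), "addr_mode": addr_mode}


if __name__ == "__main__":
    frames = [
        make_frame(0xFEEDBEEF),            # absolute address
        make_frame(0x1F, module_index=0),  # relative to the first module
    ]
    print(json.dumps({"frames": frames}, indent=2))
```

Always writing `addr_mode`, even for absolute addresses, keeps the request explicit and does not rely on the receiver's default.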

## Response
