[wasm] Webcil-in-WebAssembly #85932

Merged: 31 commits, May 16, 2023
Changes from 15 commits
310aa3e
Add option to emit webcil inside a wasm module wrapper
lambdageek May 4, 2023
2f2459d
[mono][loader] implement a webcil-in-wasm reader
lambdageek May 8, 2023
7be51f8
reword WebcilWasmWrapper summary comment
lambdageek May 8, 2023
2e58062
fix whitespace
lambdageek May 8, 2023
056b9ac
fix typos
lambdageek May 8, 2023
897df89
visit_section should bump the ptr after traversal
lambdageek May 8, 2023
15a1894
remove extra bytes from wasm webcil prefix
lambdageek May 9, 2023
1688dec
don't forget to include number of segments in the data section
lambdageek May 9, 2023
d8638f7
update the Webcil spec to include the WebAssembly wrapper module
lambdageek May 9, 2023
e8f52bb
fix typos and whitespace
lambdageek May 9, 2023
f06938f
advance endp past the data segment payload
lambdageek May 9, 2023
0f5e02f
Adjust RVA map offsets to account for wasm prefix
lambdageek May 9, 2023
4271d8f
Add a note about the rva mapping to the spec
lambdageek May 9, 2023
2c90fb5
Serve webcil-in-wasm as .wasm
lambdageek May 9, 2023
307ae9a
fix wbt
lambdageek May 11, 2023
930cfeb
remove old .webcil support from Sdk Pack Tasks
lambdageek May 11, 2023
7670b3c
Set SelfContained=true for browser-wasm runtimes (#86102)
lewing May 11, 2023
d17b8dd
Implement support for webcil in wasm in the managed WebcilReader
lambdageek May 11, 2023
61afae7
why fail?
lambdageek May 12, 2023
8a42b70
did we load the same asm twice??
lambdageek May 12, 2023
ccd074e
Merge remote-tracking branch 'origin/main' into webcil-wasm-wrapper
lambdageek May 15, 2023
6166a67
checkpoint. things are broken. but I adjusted MonoImage:raw_data
lambdageek May 12, 2023
eedf447
align webcil payload to a 4-byte boundary within the wasm module
lambdageek May 15, 2023
d9b396e
remove WIP tracing
lambdageek May 16, 2023
89ef589
assert that webcil raw data is 4-byte aligned
lambdageek May 16, 2023
d9a3fea
revert unrelated build change
lambdageek May 16, 2023
6de7dee
revert unrelated change
lambdageek May 16, 2023
88f956e
revert whitespace
lambdageek May 16, 2023
0542c51
revert WBT debugging output changes
lambdageek May 16, 2023
7c0643d
add 4-byte alignment requirement to the webcil spec
lambdageek May 16, 2023
e187b00
Don't modify MonoImageStorage:raw_data
lambdageek May 16, 2023
75 changes: 65 additions & 10 deletions docs/design/mono/webcil.md
@@ -2,21 +2,74 @@

## Version

-This is version 0.0 of the Webcil format.
+This is version 0.0 of the Webcil payload format.
+This is version 0 of the WebAssembly module Webcil wrapper.

## Motivation

When deploying the .NET runtime to the browser using WebAssembly, we have received some reports from
customers that certain users are unable to use their apps because firewalls and anti-virus software
may prevent browsers from downloading or caching assemblies with a .DLL extension and PE contents.

-This document defines a new container format for ECMA-335 assemblies
-that uses the `.webcil` extension and uses a new WebCIL container
-format.
+This document defines a new container format for ECMA-335 assemblies that uses the `.wasm` extension
+and uses a new WebCIL metadata payload format wrapped in a WebAssembly module.


## Specification

### Webcil WebAssembly module

Webcil consists of a standard [binary WebAssembly version 0 module](https://webassembly.github.io/spec/core/binary/index.html) containing the following WAT module:

``` wat
(module
  (data "\0f\00\00\00") ;; data segment 0: payload size as a 4 byte LE uint32
  (data "webcil Payload\cc") ;; data segment 1: webcil payload
  (memory (import "webcil" "memory") 1)
  (global (export "webcilVersion") i32 (i32.const 0))
  (func (export "getWebcilSize") (param $destPtr i32) (result)
    local.get $destPtr
    i32.const 0
    i32.const 4
    memory.init 0)
  (func (export "getWebcilPayload") (param $d i32) (param $n i32) (result)
    local.get $d
    i32.const 0
    local.get $n
    memory.init 1))
```

That is, the module imports linear memory 0 and exports:
* a global `i32` `webcilVersion` encoding the version of the WebAssembly wrapper (currently 0),
* a function `getWebcilSize : i32 -> ()` that writes the size of the Webcil payload to the specified
address in linear memory as a `u32` (that is: 4 LE bytes).
* a function `getWebcilPayload : i32 i32 -> ()` that writes `$n` bytes of the content of the Webcil
payload to the specified address `$d` in linear memory.

The Webcil payload size and payload content are stored in the data section of the WebAssembly module
as passive data segments 0 and 1, respectively. The module must not contain additional data
segments. The module must store the payload size in data segment 0, and the payload content in data
segment 1.

(**Rationale**: With this wrapper it is possible to split the WebAssembly module into a *prefix*
consisting of everything before the data section, the data section, and a *suffix* that consists of
everything after the data section. The prefix and suffix do not depend on the contents of the
Webcil payload and a tool that generates Webcil files could simply emit the prefix and suffix from
constant data. The data section is the only content that varies between different Webcil-encoded .NET
assemblies.)

(**Rationale**: Encoding the payload in the data section in passive data segments with known indices
allows a runtime that does not include a WebAssembly host or a runtime that does not wish to
instantiate the WebAssembly module to extract the payload by traversing the WebAssembly module and
locating the Webcil payload in the data section at segment 1.)

(**Note**: the wrapper may be versioned independently of the payload.)
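
The second rationale (extracting the payload without instantiating the module) can be illustrated with a small standalone C sketch. It is not the Mono loader from this PR: the names `read_uleb32` and `webcil_find_payload` are hypothetical, and the code assumes only the layout described above (a data section with id 11 holding exactly two passive segments, the first of them 4 bytes long).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one ULEB128-encoded u32 at *p, never reading at or past end.
 * Advances *p past the encoded value. Returns 0 on success, -1 on error. */
static int read_uleb32(const uint8_t **p, const uint8_t *end, uint32_t *out)
{
    uint32_t result = 0;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        if (*p >= end)
            return -1;
        uint8_t b = *(*p)++;
        result |= (uint32_t)(b & 0x7f) << shift;
        if ((b & 0x80) == 0) {
            *out = result;
            return 0;
        }
    }
    return -1; /* more than 5 bytes is not a valid u32 encoding */
}

/* Hypothetical helper: locate data segment 1 (the Webcil payload) in a wrapper module.
 * On success returns 0 and points *payload into buf. */
int webcil_find_payload(const uint8_t *buf, size_t len,
                        const uint8_t **payload, uint32_t *payload_len)
{
    static const uint8_t header[8] = { 0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00 };
    if (len < sizeof header || memcmp(buf, header, sizeof header) != 0)
        return -1;

    const uint8_t *p = buf + sizeof header;
    const uint8_t *end = buf + len;
    while (p < end) {
        uint8_t id = *p++;
        uint32_t size;
        if (read_uleb32(&p, end, &size) != 0 || size > (size_t)(end - p))
            return -1;
        if (id != 11) { /* not the data section: skip its contents */
            p += size;
            continue;
        }
        const uint8_t *sec_end = p + size;
        uint32_t count, seg_len;
        if (read_uleb32(&p, sec_end, &count) != 0 || count != 2)
            return -1;
        /* segment 0: passive (code 1), 4 bytes holding the payload size */
        if (p >= sec_end || *p++ != 1)
            return -1;
        if (read_uleb32(&p, sec_end, &seg_len) != 0 || seg_len != 4 ||
            (size_t)(sec_end - p) < 4)
            return -1;
        p += 4;
        /* segment 1: passive (code 1), the Webcil payload itself */
        if (p >= sec_end || *p++ != 1)
            return -1;
        if (read_uleb32(&p, sec_end, &seg_len) != 0 || seg_len > (size_t)(sec_end - p))
            return -1;
        *payload = p;
        *payload_len = seg_len;
        return 0;
    }
    return -1; /* no data section found */
}
```

A caller would pass the raw bytes of the `.wasm` file and, on success, treat `payload` as the start of the Webcil headers.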


### Webcil payload

The webcil payload contains the ECMA-335 metadata, IL and resources comprising a .NET assembly.

As our starting point we take section II.25.1 "Structure of the
runtime file format" from ECMA-335 6th Edition.

@@ -40,12 +93,12 @@ A Webcil file follows a similar structure
| CLI Data |
| |

-## Webcil Headers
+### Webcil Headers

The Webcil headers consist of a Webcil header followed by a sequence of section headers.
(All multi-byte integers are in little endian format).

-### Webcil Header
+#### Webcil Header

``` c
struct WebcilHeader {
@@ -75,11 +128,11 @@ The next pairs of integers are a subset of the PE Header data directory specifyi
of the CLI header, as well as the directory entry for the PE debug directory.


-### Section header table
+#### Section header table

Immediately following the Webcil header is a sequence (whose length is given by `coff_sections`
above) of section headers giving their virtual address and virtual size, as well as the offset in
-the Webcil file and the size in the file. This is a subset of the PE section header that includes
+the Webcil payload and the size in the file. This is a subset of the PE section header that includes
enough information to correctly interpret the RVAs from the webcil header and from the .NET
metadata. Other information (such as the section names) is not included.

@@ -92,11 +145,13 @@ struct SectionHeader {
};
```

-### Sections
(**Note**: the `st_raw_data_ptr` member is an offset from the beginning of the Webcil payload, not from the beginning of the WebAssembly wrapper module.)

+#### Sections

Immediately following the section table are the sections. These are copied verbatim from the PE file.
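
Because `st_raw_data_ptr` is relative to the Webcil payload, a reader that works on the whole `.wasm` file has to add the payload's position inside the module when resolving an RVA. The following C sketch illustrates that arithmetic under stated assumptions: the struct and its field names other than `st_raw_data_ptr` are hypothetical stand-ins for the section header elided from this diff, and `payload_start` is assumed to be the byte offset of data segment 1's contents within the module.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a Webcil section header; only st_raw_data_ptr is named in the spec text above. */
struct section_sketch {
    uint32_t virtual_size;     /* size of the section when mapped */
    uint32_t virtual_address;  /* RVA at which the section starts */
    uint32_t raw_data_size;    /* number of bytes stored in the payload */
    uint32_t st_raw_data_ptr;  /* offset from the start of the Webcil payload */
};

/* Map an RVA to a byte offset within the wrapper module.
 * Returns 0 and sets *module_offset on success, -1 if no section holds the RVA. */
int rva_to_module_offset(const struct section_sketch *sections, size_t count,
                         uint32_t payload_start, uint32_t rva,
                         uint32_t *module_offset)
{
    for (size_t i = 0; i < count; i++) {
        const struct section_sketch *s = &sections[i];
        /* only bytes that are actually stored in the payload can be resolved */
        if (rva >= s->virtual_address && rva - s->virtual_address < s->raw_data_size) {
            uint32_t payload_offset = s->st_raw_data_ptr + (rva - s->virtual_address);
            *module_offset = payload_start + payload_offset;
            return 0;
        }
    }
    return -1;
}
```

A reader that has already copied the payload out of the module would pass `payload_start = 0` and get plain payload offsets back.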

-## Rationale
+### Rationale

The intention is to include only the information necessary for the runtime to locate the metadata
root, and to resolve the RVA references in the metadata (for locating data declarations and method IL).
@@ -42,6 +42,8 @@ FilePosition SectionStart

private string InputPath => _inputPath;

public bool WrapInWebAssembly { get; set; } = true;

private WebcilConverter(string inputPath, string outputPath)
{
_inputPath = inputPath;
@@ -62,6 +64,25 @@ public void ConvertToWebcil()
}

using var outputStream = File.Open(_outputPath, FileMode.Create, FileAccess.Write);
if (!WrapInWebAssembly)
{
WriteConversionTo(outputStream, inputStream, peInfo, wcInfo);
}
else
{
// if wrapping in WASM, write the webcil payload to memory because we need to discover the length

// webcil is about the same size as the PE file
using var memoryStream = new MemoryStream(checked((int)inputStream.Length));
WriteConversionTo(memoryStream, inputStream, peInfo, wcInfo);
var wrapper = new WebcilWasmWrapper(memoryStream);
memoryStream.Seek(0, SeekOrigin.Begin);
wrapper.WriteWasmWrappedWebcil(outputStream);
}
}

public void WriteConversionTo(Stream outputStream, FileStream inputStream, PEFileInfo peInfo, WCFileInfo wcInfo)
{
WriteHeader(outputStream, wcInfo.Header);
WriteSectionHeaders(outputStream, wcInfo.SectionHeaders);
CopySections(outputStream, inputStream, peInfo.SectionHeaders);
@@ -210,7 +231,7 @@ private static void WriteStructure<T>(Stream s, T structure)
}
#endif

-private static void CopySections(FileStream outStream, FileStream inputStream, ImmutableArray<SectionHeader> peSections)
+private static void CopySections(Stream outStream, FileStream inputStream, ImmutableArray<SectionHeader> peSections)
{
// endianness: ok, we're just copying from one stream to another
foreach (var peHeader in peSections)
@@ -0,0 +1,172 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.IO;
using System.Collections.Immutable;
using System.Reflection.PortableExecutable;
using System.Runtime.InteropServices;

namespace Microsoft.NET.WebAssembly.Webcil;

//
// Emits a simple WebAssembly wrapper module around a given webcil payload.
//
// The entire wasm module is going to be unchanging, except for the data section which has 2 passive
// segments. segment 0 is 4 bytes and contains the length of the webcil payload. segment 1 is of a
// variable size and contains the webcil payload.
//
// The unchanging parts are stored as a "prefix" and "suffix" which contain the bytes for the following
// WAT module, split into the parts that come before the data section, and the bytes that come after:
//
// (module
// (data "\0f\00\00\00") ;; data segment 0: payload size as a 4 byte LE uint32
// (data "webcil Payload\cc") ;; data segment 1: webcil payload
// (memory (import "webcil" "memory") 1)
// (global (export "webcilVersion") i32 (i32.const 0))
// (func (export "getWebcilSize") (param $destPtr i32) (result)
// local.get $destPtr
// i32.const 0
// i32.const 4
// memory.init 0)
// (func (export "getWebcilPayload") (param $d i32) (param $n i32) (result)
// local.get $d
// i32.const 0
// local.get $n
// memory.init 1))
public class WebcilWasmWrapper
{
private readonly Stream _webcilPayloadStream;
private readonly uint _webcilPayloadSize;

public WebcilWasmWrapper(Stream webcilPayloadStream)
{
_webcilPayloadStream = webcilPayloadStream;
long len = webcilPayloadStream.Length;
if (len > (long)uint.MaxValue)
throw new InvalidOperationException("webcil payload too large");
_webcilPayloadSize = (uint)len;
}

public void WriteWasmWrappedWebcil(Stream outputStream)
{
WriteWasmHeader(outputStream);
using (var writer = new BinaryWriter(outputStream, System.Text.Encoding.UTF8, leaveOpen: true))
{
WriteDataSection(writer);
}
WriteWasmSuffix(outputStream);
}

//
// Everything from the above wat module before the data section
//
// extracted by wasm-reader -s wrapper.wasm
private static
#if NET7_0_OR_GREATER
ReadOnlyMemory<byte>
#else
byte[]
#endif
s_wasmWrapperPrefix = new byte[] {
0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, 0x01, 0x0a, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x02, 0x7f, 0x7f, 0x00, 0x02, 0x12, 0x01, 0x06, 0x77, 0x65, 0x62, 0x63, 0x69, 0x6c, 0x06, 0x6d,
0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01, 0x03, 0x03, 0x02, 0x00, 0x01, 0x06, 0x0b, 0x02, 0x7f, 0x00, 0x41, 0x00, 0x0b, 0x7f, 0x00, 0x41, 0x00, 0x0b, 0x07, 0x41, 0x04, 0x0d, 0x77, 0x65,
0x62, 0x63, 0x69, 0x6c, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6f, 0x6e, 0x03, 0x00, 0x0a, 0x77, 0x65, 0x62, 0x63, 0x69, 0x6c, 0x53, 0x69, 0x7a, 0x65, 0x03, 0x01, 0x0d, 0x67, 0x65, 0x74, 0x57, 0x65,
0x62, 0x63, 0x69, 0x6c, 0x53, 0x69, 0x7a, 0x65, 0x00, 0x00, 0x10, 0x67, 0x65, 0x74, 0x57, 0x65, 0x62, 0x63, 0x69, 0x6c, 0x50, 0x61, 0x79, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x01, 0x0c, 0x01, 0x02,
0x0a, 0x1b, 0x02, 0x0c, 0x00, 0x20, 0x00, 0x41, 0x00, 0x41, 0x04, 0xfc, 0x08, 0x00, 0x00, 0x0b, 0x0c, 0x00, 0x20, 0x00, 0x41, 0x00, 0x20, 0x01, 0xfc, 0x08, 0x01, 0x00, 0x0b,
};
//
// Everything from the above wat module after the data section
//
// extracted by wasm-reader -s wrapper.wasm
private static
#if NET7_0_OR_GREATER
ReadOnlyMemory<byte>
#else
byte[]
#endif
s_wasmWrapperSuffix = new byte[] {
0x00, 0x1b, 0x04, 0x6e, 0x61, 0x6d, 0x65, 0x02, 0x14, 0x02, 0x00, 0x01, 0x00, 0x07, 0x64, 0x65, 0x73, 0x74, 0x50, 0x74, 0x72, 0x01, 0x02, 0x00, 0x01, 0x64, 0x01, 0x01, 0x6e,
};

private static void WriteWasmHeader(Stream outputStream)
{
#if NET7_0_OR_GREATER
outputStream.Write(s_wasmWrapperPrefix.Span);
#else
outputStream.Write(s_wasmWrapperPrefix, 0, s_wasmWrapperPrefix.Length);
#endif
}

private static void WriteWasmSuffix(Stream outputStream)
{
#if NET7_0_OR_GREATER
outputStream.Write(s_wasmWrapperSuffix.Span);
#else
outputStream.Write(s_wasmWrapperSuffix, 0, s_wasmWrapperSuffix.Length);
#endif
}

// 1 byte to encode "passive" data segment
private const uint SegmentCodeSize = 1;

private void WriteDataSection(BinaryWriter writer)
{
uint dataSectionSize = 0;
// uleb128 encoding of number of segments
dataSectionSize += 1; // there's always 2 segments which encodes to 1 byte
// compute the segment 0 size:
// segment 0 has 1 byte segment code, 1 byte of size and 4 bytes of payload
dataSectionSize += SegmentCodeSize + 1 + 4;

// encode webcil size as a uleb128
byte[] ulebSegmentSize = ULEB128Encode(_webcilPayloadSize);

// compute the segment 1 size:
// segment 1 has 1 byte segment code, a uleb128 encoding of the webcilPayloadSize, and the payload
checked
{
dataSectionSize += SegmentCodeSize + (uint)ulebSegmentSize.Length + _webcilPayloadSize;
}

byte[] ulebSectionSize = ULEB128Encode(dataSectionSize);

writer.Write((byte)11); // section Data
writer.Write(ulebSectionSize, 0, ulebSectionSize.Length);

writer.Write((byte)2); // number of segments

// write segment 0
writer.Write((byte)1); // passive segment
writer.Write((byte)4); // segment size: 4
writer.Write((uint)_webcilPayloadSize); // payload is an unsigned 32 bit number

// write segment 1
writer.Write((byte)1); // passive segment
writer.Write(ulebSegmentSize, 0, ulebSegmentSize.Length); // segment size: _webcilPayloadSize
_webcilPayloadStream.CopyTo(writer.BaseStream); // payload is the entire webcil content
}

private static byte[] ULEB128Encode(uint value)
{
uint n = value;
int len = 0;
do
{
n >>= 7;
len++;
} while (n != 0);
byte[] arr = new byte[len];
int i = 0;
n = value;
do
{
byte b = (byte)(n & 0x7f);
n >>= 7;
if (n != 0)
b |= 0x80;
arr[i++] = b;
} while (n != 0);
return arr;
}
}
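
The size arithmetic in `WriteDataSection` and the byte layout produced by `ULEB128Encode` can be checked with a small standalone C sketch. The function names here are hypothetical and not part of the PR. The sketch uses a 64-bit total instead of the C# `checked` block, so it reports the size rather than throwing on overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode value as ULEB128 into out (at most 5 bytes for a u32); returns the byte count. */
size_t uleb128_encode_u32(uint32_t value, uint8_t out[5])
{
    size_t len = 0;
    do {
        uint8_t b = (uint8_t)(value & 0x7f);
        value >>= 7;
        if (value != 0)
            b |= 0x80; /* continuation bit: more 7-bit groups follow */
        out[len++] = b;
    } while (value != 0);
    return len;
}

/* Size of the data section contents, excluding the section id byte and the section size field. */
uint64_t webcil_data_section_size(uint32_t payload_size)
{
    uint8_t scratch[5];
    uint64_t size = 1;  /* segment count (2) encodes to one byte */
    size += 1 + 1 + 4;  /* segment 0: passive code, length byte, u32 payload size */
    /* segment 1: passive code, ULEB128 length, then the payload bytes */
    size += 1 + uleb128_encode_u32(payload_size, scratch) + (uint64_t)payload_size;
    return size;
}
```

For the 15-byte placeholder payload in the WAT comment above, this gives 1 + 6 + (1 + 1 + 15) = 24 bytes of section contents.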
14 changes: 14 additions & 0 deletions src/mono/mono/metadata/assembly.c
@@ -44,6 +44,7 @@
#include <mono/utils/atomic.h>
#include <mono/utils/mono-os-mutex.h>
#include <mono/metadata/mono-private-unstable.h>
#include <mono/metadata/webcil-loader.h>

#ifndef HOST_WIN32
#include <sys/types.h>
@@ -1465,6 +1466,13 @@ bundled_assembly_match (const char *bundled_name, const char *name)
if (bprefix == nprefix && strncmp (bundled_name, name, bprefix) == 0)
return TRUE;
}
/* if they want a .dll and we have the matching .wasm webcil-in-wasm, return it */
if (g_str_has_suffix (bundled_name, MONO_WEBCIL_IN_WASM_EXTENSION) && g_str_has_suffix (name, ".dll")) {
size_t bprefix = strlen (bundled_name) - strlen (MONO_WEBCIL_IN_WASM_EXTENSION);
size_t nprefix = strlen (name) - strlen (".dll");
if (bprefix == nprefix && strncmp (bundled_name, name, bprefix) == 0)
return TRUE;
}
return FALSE;
#endif
}
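
The added branch compares the two names with their extensions stripped. A glib-free C sketch of the same comparison follows. The helper names are hypothetical, and the value of `MONO_WEBCIL_IN_WASM_EXTENSION` is assumed to be `".wasm"` based on the comment in the hunk above.

```c
#include <string.h>

/* Hypothetical stand-in for MONO_WEBCIL_IN_WASM_EXTENSION; the value is an assumption. */
#define WEBCIL_IN_WASM_EXT ".wasm"

/* Returns nonzero when s ends with suffix. */
static int has_suffix(const char *s, const char *suffix)
{
    size_t slen = strlen(s), xlen = strlen(suffix);
    return slen >= xlen && strcmp(s + slen - xlen, suffix) == 0;
}

/* Returns 1 when bundled_name is "<stem>.wasm" and name is "<stem>.dll" with identical stems. */
int wasm_bundle_matches_dll(const char *bundled_name, const char *name)
{
    if (!has_suffix(bundled_name, WEBCIL_IN_WASM_EXT) || !has_suffix(name, ".dll"))
        return 0;
    size_t bprefix = strlen(bundled_name) - strlen(WEBCIL_IN_WASM_EXT);
    size_t nprefix = strlen(name) - strlen(".dll");
    /* stems must have equal length and equal bytes */
    return bprefix == nprefix && strncmp(bundled_name, name, bprefix) == 0;
}
```

Checking the stem lengths before `strncmp` is what keeps a request for `System.dll` from matching a bundled `System.Core.wasm`.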
@@ -2737,6 +2745,12 @@ mono_assembly_load_corlib (void)
corlib = mono_assembly_request_open (corlib_name, &req, &status);
g_free (corlib_name);
}
if (!corlib) {
/* Maybe it's in a bundle */
char *corlib_name = g_strdup_printf ("%s%s", MONO_ASSEMBLY_CORLIB_NAME, MONO_WEBCIL_IN_WASM_EXTENSION);
corlib = mono_assembly_request_open (corlib_name, &req, &status);
g_free (corlib_name);
}
#endif
g_assert (corlib);

5 changes: 3 additions & 2 deletions src/mono/mono/metadata/image.c
@@ -958,8 +958,9 @@ mono_has_pdb_checksum (char *raw_data, uint32_t raw_data_len)
int32_t ret = try_load_pe_cli_header (raw_data, raw_data_len, &cli_header);

#ifdef ENABLE_WEBCIL
int32_t webcil_section_adjustment = 0;
if (ret == -1) {
-ret = mono_webcil_load_cli_header (raw_data, raw_data_len, 0, &cli_header);
+ret = mono_webcil_load_cli_header (raw_data, raw_data_len, 0, &cli_header, &webcil_section_adjustment);
is_pe = FALSE;
}
#endif
@@ -992,7 +993,7 @@ mono_has_pdb_checksum (char *raw_data, uint32_t raw_data_len)
}
#ifdef ENABLE_WEBCIL
else {
-ret = mono_webcil_load_section_table (raw_data, raw_data_len, ret, &t);
+ret = mono_webcil_load_section_table (raw_data, raw_data_len, ret, webcil_section_adjustment, &t);
if (ret == -1)
return FALSE;
}