This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Get rid of wasm-gc #1262

Closed
pepyakin opened this issue Dec 12, 2018 · 62 comments · Fixed by #12280
Labels
I7-refactor Code needs refactoring.

@pepyakin
Contributor

wasm-gc has been deprecated, but we still use it in our build pipeline.

The author mentions that:

The Rust compiler now natively supports --gc-sections when linking wasm executables, which means wasm executables already have 90% of their garbage removed when coming out of the compiler.

But for us this is not entirely true; the size of the runtime is:

  • 882K baseline
  • 636K with wasm-gc

And

The wasm-pack (and wasm-bindgen) project will already run this by default for you, so there's no need to run it again.

But we use neither of these.

We need to figure out what we want to use.

@pepyakin pepyakin added this to the As-and-when milestone Dec 12, 2018
@pepyakin
Contributor Author

According to twiggy diff node_runtime.wasm node_runtime.compact.wasm:

 Delta Bytes │ Item
─────────────┼─────────────────────────────────
     -252601 ┊ <total>
      -68707 ┊ custom section '.debug_str'
      -56447 ┊ custom section '.debug_line'
      -52165 ┊ custom section '.debug_info'
      -16040 ┊ custom section '.debug_ranges'
      -10261 ┊ custom section '.debug_pubnames'
       -6072 ┊ custom section '.debug_pubtypes'
       -1432 ┊ custom section '.debug_abbrev'
         -20 ┊ "function names" subsection
         -18 ┊ custom section '.debug_macinfo'
          -3 ┊ __wasm_call_ctors
          -1 ┊ elem[0]
          -1 ┊ func[357]

So the major source of the cruft is the .debug_* sections.

There is a tool called wasm-strip in the wabt suite, which strips all custom sections.

882K baseline
643K after wasm-strip
636K after wasm-gc

However, if we decide to add more tooling to our build pipeline, we could also use wasm-opt (from the binaryen suite), which can optimize the binary (including pruning dead symbols) and produces a binary of 561K.
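For illustration, a minimal sketch of how a wasm-strip invocation could be wired into the build script, mirroring the existing wasm-gc call. The function and file names here are assumptions for illustration, not actual wasm-builder code:

```rust
use std::process::Command;

// Hypothetical sketch: build the wasm-strip invocation the same way the
// build script currently shells out to wasm-gc. wasm-strip rewrites the
// file in place, removing all custom sections (including the .debug_*
// ones listed in the twiggy diff above).
fn strip_command(wasm_file: &str) -> Command {
    let mut cmd = Command::new("wasm-strip");
    cmd.arg(wasm_file);
    cmd
}

fn main() {
    // Running this for real would require wabt's wasm-strip on PATH:
    // strip_command("node_runtime.wasm").status()
    let cmd = strip_command("node_runtime.wasm");
    println!("{:?}", cmd);
}
```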

@bkchr
Member

bkchr commented Dec 14, 2018

Why do we have debug sections in a release build? Shouldn't they be removed?

@pepyakin
Contributor Author

I'd say they should be removed! But for now this is not the case (and it has been this way for a while).
I'm not sure why it is this way, but I've asked.

@gavofyork gavofyork added the I7-refactor Code needs refactoring. label Dec 18, 2018
@pepyakin
Contributor Author

pepyakin commented Oct 2, 2019

So, a quick recap:

  • Substrate uses a wasm file to encode the logic of the blockchain. We call it the wasm runtime.
  • An example wasm runtime is written in Rust and built with cargo. Rust uses LLVM's LLD linker to link the Rust libraries and produce the final wasm blob.
  • By default, rustc puts quite a lot of unnecessary junk into the final binary.
  • To strip this unnecessary stuff, we use wasm-gc.

There is a chance that things have changed since we last checked. Notably, there is an LLD flag, -s, which can be used to strip all unnecessary sections.

To check whether that actually helps we need to:

  1. Disable wasm-gc. For the experiment, we can just comment out the invocation below:

    let res = Command::new("wasm-gc")
        .arg(&wasm_file)
        .arg(&wasm_compact_file)
        .status()
        .map(|s| s.success());

  2. Try building the wasm runtime with this flag passed to LLD and wasm-gc disabled. This can be achieved by building Substrate with the following command:

    WASM_BUILD_RUSTFLAGS='-C link-arg=-s' cargo build --release
    

You should be able to find the runtime files under the target/release/wbuild directory. Keep in mind that if you skip wasm-gc there won't be a .compact.wasm file.

If the results turn out to be satisfying, we can then get rid of wasm-gc entirely.
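Step 1 can be sketched as a stub in the build script: the wasm-gc invocation is commented out and replaced with a successful no-op result, so downstream checks of the result still pass. This is a hypothetical illustration, not the actual wasm-builder code:

```rust
use std::io;

// Hypothetical stub for step 1: pretend wasm-gc ran successfully without
// invoking it. The real invocation, commented out for the experiment:
//     let res = Command::new("wasm-gc")
//         .arg(&wasm_file)
//         .arg(&wasm_compact_file)
//         .status()
//         .map(|s| s.success());
fn wasm_gc_stub() -> Result<bool, io::Error> {
    Ok(true)
}

fn main() {
    let res = wasm_gc_stub();
    println!("wasm-gc stubbed out: {:?}", res);
}
```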

@pepyakin pepyakin changed the title wasm-gc is deprecated Get rid of wasm-gc Oct 2, 2019
@kianenigma
Contributor

kianenigma commented Oct 3, 2019

with wasm-gc

// compact 
→ pwd
/Users/kianenigma/Desktop/Parity/substrate/target/release/wbuild/node-runtime
 → du -sk node_runtime.compact.wasm 
1156	node_runtime.compact.wasm

// Original
→ pwd                      
.../substrate/target/release/wbuild/target/wasm32-unknown-unknown/release
→ du -sk node_runtime.wasm        
1480	node_runtime.wasm

without wasm-gc

// run with `let res: Result<bool, std::io::Error> = Ok(true);`
→ pwd
/substrate/target/release/wbuild/target/wasm32-unknown-unknown/release
→ du -sk node_runtime.wasm
1168	node_runtime.wasm

for which, after a cargo clean, I saved the compact file of the previous build and pasted it back to silence this error:

error: couldn't read /Users/kianenigma/Desktop/Parity/substrate/target/release/wbuild/node-runtime/node_runtime.compact.wasm: No such file or directory (os error 2)
 --> /Users/kianenigma/Desktop/Parity/substrate/target/release/build/node-runtime-de07cf4da2bc533b/out/wasm_binary.rs:2:36
  |
2 |                 pub const WASM_BINARY: &[u8] = include_bytes!("/Users/kianenigma/Desktop/Parity/substrate/target/release/wbuild/node-runtime/node_runtime.compact.wasm");
  |                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

and then ran the command with the linker flag, but I am not 100% sure I have done it correctly. I pasted the cwd as well so someone can check.

@Swader
Contributor

Swader commented Oct 4, 2019

Just a note that the test suite depends on wasm-gc and the prerequisites installation script doesn't pull it in. To reduce new-user friction, it should either be added to that script or removed from the tests, imo.

@bkchr
Member

bkchr commented Oct 4, 2019

Which script? The init script installs it.
Nevertheless, it is documented in the readme, and having clear install instructions is always better than scripts that only work on one distribution.

@Swader
Contributor

Swader commented Oct 4, 2019

You're right, I've been relying too much on the getsubstrate.io script, which does not pull it in, but the README does document it.

@bkchr
Member

bkchr commented Oct 4, 2019

@shawntabrizi ^^^

@bkchr
Member

bkchr commented Oct 4, 2019

(That is also a script I really would like to get rid of!)

@Swader
Contributor

Swader commented Oct 4, 2019

I ran into it from the Substrate Kitties workshop and here.

As a tech ed I'm all for simplification and abstracting bootstrapping away from the user, but I do agree that if we cannot provide a coherent experience to newcomers, such a script shouldn't be a thing. Maybe something like a Vagrant box with all the prerequisites installed would be better.

@shawntabrizi
Member

We can and should update the getsubstrate.io script for the time being.

@bkchr I am 100% with you; let's get rid of all the scripts. For now, though, I don't see a good solution for installing all these prerequisites.

Maybe the best thing is literally an install page with instructions for each operating system?

@Swader vagrant is a virtual env?

@shawntabrizi
Member

Here is a PR to update the getsubstrate.io script for now: paritytech/scripts#101

@Swader
Contributor

Swader commented Oct 4, 2019

@shawntabrizi Vagrant is a VM tool for composing and running reproducible headless VMs with a small size and resource footprint. For a good while I was maintaining https://github.com/Swader/homestead_improved

Basically it lets you pre-install stuff with provisioning scripts (shell, Ansible, etc.) into a headless distro like a 200MB Ubuntu, similar to what you might find in WSL, but with all the prereqs already in place. So no downloading Rust, running updates, etc. The organization or author keeps the box up to date; users merely run vagrant box update once in a while, though this is optional because once they run an instance of the VM they can just update it like any other headless VM.

Here's a better explained rundown of mine from ages ago: https://www.sitepoint.com/re-introducing-vagrant-right-way-start-php/ - mentions PHP but has nothing to do with it specifically.

In a nutshell, all OSes then have the exact same thing. On Windows it works exactly the same as on OS X or Linux. So suddenly all of your users have to follow exactly one set of instructions which always works (download Vagrant, oraclevm, and this box), and they all have the same env from which to report bugs, which is coincidentally identical to the env on which you can then reproduce them.

@kianenigma
Contributor

@Swader if you are recommending that we maintain Vagrant boxes (or similar IaC tools), that'd be cool. We already have docker builds, which are mostly used to play with the UI and less to hack on Substrate itself.

But such things will never be enough on their own. We will always need manual installation instructions as well.

@Swader
Contributor

Swader commented Oct 5, 2019

I agree. Let me see if I can put together a box; I could use one myself, so I'm sure I could dedicate part of my time to maintaining it. We can evaluate once it's done.

@DrSensor
Copy link

Hi, is it okay if I work on this issue?

@bkchr
Member

bkchr commented Oct 19, 2019

You can work on this, but our experiments show that you probably will not find anything that achieves the same result.

@shawntabrizi
Member

@pepyakin this should be closed right?

@pepyakin
Contributor Author

I haven't been following what's happening on the wasm-builder side lately, but a quick glance reveals that it still invokes wasm-gc. So apparently it is still being used.

@athei
Member

athei commented Jan 21, 2022

I don't get it. AFAIK wasm-builder already builds the runtime with lto in the release profile. How do you see those differences with only lto enabled:

release_profile.insert("lto".into(), true.into());

@koute
Contributor

koute commented Jan 21, 2022

I don't get it. AFAIK wasm-builder already builds the runtime with lto in the release profile. How do you see those differences with only lto enabled:

Just guessing: probably because of the extra codegen-units=1, and also AFAIK by default lto runs in ThinLTO mode, doesn't it? (So it'd be different from lto=fat.)

The build time metric is a bit daunting.

Hmm.... indeed, although extra 40s doesn't seem that bad considering how long everything else takes to build.

Maybe we should have separate cargo profiles for a proper release build and a "development" release build?

@pepyakin
Contributor Author

pepyakin commented Jan 21, 2022

Maybe we should have separate cargo profiles for a proper release build and a "development" release build?

Yes, I was thinking about something along these lines too. Alternatively, maybe finally get debug builds working (with lots of opt-level overrides for performance-critical crates).

Hmm.... indeed, although extra 40s doesn't seem that bad considering how long everything else takes to build.

The problem is that it's 40 seconds for one runtime. In polkadot we have 4 ༼ ༎ຶ ෴ ༎ຶ༽

I envision that with #7288 we will be able to split the runtimes out so they can be built one by one with a separate cargo xbuild-runtime, or something along these lines, which would bring us back to only 40s.

@koute
Contributor

koute commented Jan 21, 2022

Hmm.... indeed, although extra 40s doesn't seem that bad considering how long everything else takes to build.

The problem is that it's 40 seconds for one runtime. In polkadot we have 4 ༼ ༎ຶ ෴ ༎ຶ༽

Yes but, wouldn't they be compiled in parallel?

@athei
Member

athei commented Jan 21, 2022

Just guessing - probably because of the extra codegen-units=1

But @nazar-pc reports a difference with lto="fat" only. This is what confuses me.

and also AFAIK by default lto runs in thinlto mode, doesn't it? (So it'd be different than lto=fat.)

Not according to this documentation. AFAIK true == fat.

@pepyakin
Contributor Author

Yes but, wouldn't they be compiled in parallel?

Oh, right, that's true.

@nazar-pc
Contributor

But @nazar-pc reports a difference for lto="fat" only. This is what confuses me.

I think this was caused by incremental compilation; I did not wipe target between those initial tests until I got to measuring time.

Yes but, wouldn't they be compiled in parallel?

Assuming you have enough CPU cores of course 😉

@athei
Member

athei commented Jan 21, 2022

I think this is caused by incremental compilation, I did not wipe target in between those initial tests until I got to measure time.

You also reported different code sizes.

@nazar-pc
Contributor

You also reported different code sizes.

I just re-ran it a few times, and I reproducibly get different sizes with and without .append_to_rust_flags("-C lto=fat") added to the WasmBuilder call.

I don't think Rust's incremental compilation guarantees that builds from scratch are fully identical to incremental rebuilds, but that's not what's happening here regardless.

@nazar-pc
Contributor

nazar-pc commented Jan 21, 2022

I checked the commands being generated, and with release_profile.insert("lto".into(), true.into()); you only get -C lto, which explains the difference.

I guess you need "true".into() to get what you expected. I don't like those weakly and mixed-typed parameters; they are confusing 😞
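The confusion can be illustrated with a small sketch. This is a hypothetical helper, not the actual wasm-builder code: a boolean profile value serializes to the bare -C lto, while a string value serializes to the explicit -C lto=&lt;mode&gt; form:

```rust
// Hypothetical sketch of serializing a Cargo profile `lto` value into a
// rustc flag; not the actual wasm-builder implementation.
enum ProfileValue {
    Bool(bool),
    Str(String),
}

fn lto_flag(value: &ProfileValue) -> Option<String> {
    match value {
        // A boolean `true` (as produced by `true.into()`) yields only
        // the bare `-C lto` flag...
        ProfileValue::Bool(true) => Some("-C lto".to_string()),
        ProfileValue::Bool(false) => None,
        // ...while a string value like "fat" yields `-C lto=fat`.
        ProfileValue::Str(mode) => Some(format!("-C lto={}", mode)),
    }
}

fn main() {
    println!("{:?}", lto_flag(&ProfileValue::Bool(true)));
    println!("{:?}", lto_flag(&ProfileValue::Str("fat".into())));
}
```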

@koute
Contributor

koute commented Jan 21, 2022

Yes but, wouldn't they be compiled in parallel?

Assuming you have enough CPU cores of course 😉

Well, indeed I do, whole 32 of them. (:

Anyhow, this seems like something we should measure. Maybe you could also compile polkadot (so all of the runtimes get compiled) while limiting compilation to a certain number of cores (on Linux you should be able to do this through taskset) and compare how the compile time changes as a function of N cores?

@nazar-pc
Contributor

Polkadot compilation on the same machine as above; also attaching timing results.

Status quo:

   Completed rococo-runtime v0.9.13 build script (run) in 143.0s
   Completed westend-runtime v0.9.13 build script (run) in 148.2s
   Completed polkadot-runtime v0.9.13 build script (run) in 150.6s
   Completed kusama-runtime v0.9.13 build script (run) in 151.9s
...
    Finished release [optimized] target(s) in 6m 15s

cargo-timing-before.html.zip

-C lto=fat -C codegen-units=1:

   Completed rococo-runtime v0.9.13 build script (run) in 169.4s
   Completed westend-runtime v0.9.13 build script (run) in 187.0s
   Completed polkadot-runtime v0.9.13 build script (run) in 193.7s
   Completed kusama-runtime v0.9.13 build script (run) in 212.5s
...
    Finished release [optimized] target(s) in 7m 06s

cargo-timing-after.html.zip

Runtime compilation into WASM is a single-threaded process, so with 4+ cores the time overhead doesn't add up and compilation happens in parallel.
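For reference, the arithmetic behind those timings: the per-runtime build-script overhead of -C lto=fat -C codegen-units=1 ranges from roughly 26s to 61s, yet the wall-clock difference is only 51s (7m06s vs 6m15s) because the runtimes build in parallel:

```rust
// Build-script timings quoted above, in seconds:
// rococo, westend, polkadot, kusama.
const BEFORE: [f64; 4] = [143.0, 148.2, 150.6, 151.9];
const AFTER: [f64; 4] = [169.4, 187.0, 193.7, 212.5];

// Per-runtime overhead of `-C lto=fat -C codegen-units=1`.
fn overheads() -> Vec<f64> {
    BEFORE.iter().zip(AFTER.iter()).map(|(b, a)| a - b).collect()
}

// Wall-clock difference: 7m06s - 6m15s = 51s, much less than the sum of
// the per-runtime overheads, since the runtimes build in parallel.
fn wall_clock_delta_secs() -> f64 {
    (7.0 * 60.0 + 6.0) - (6.0 * 60.0 + 15.0)
}

fn main() {
    println!("per-runtime overheads: {:?}", overheads());
    println!("wall-clock delta: {}s", wall_clock_delta_secs());
}
```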

@bkchr
Member

bkchr commented Jan 21, 2022

We could just remove wasm-gc and do the optimal build only when we prepare a release. We already do this for stuff like log, where we disable it completely for the on-chain build to reduce the final binary size.

@athei
Member

athei commented Jan 28, 2022

We could just remove wasm-gc and do the optimal build only when we prepare a release. We already do this for stuff like log, where we disable it completely for the on-chain build to reduce the final binary size.

#10747

However, I kept wasm-gc around for now, as it still reduces the file size. wasm-opt needs to be applied first, but I didn't want to conflate that into one PR.

I am not sure how important code size is, but we could also look into compiling the runtime with opt-level = z.

@bkchr
Member

bkchr commented Jan 28, 2022

could also look into compiling the runtime with opt-level = z.

I did this about two years ago and it didn't bring any good results. I think it was even slower.

@athei
Member

athei commented Jan 28, 2022

could also look into compiling the runtime with opt-level = z.

I did this about two years ago and it didn't bring any good results. I think it was even slower.

This isn't surprising. It optimizes for smaller rather than faster code, though smaller isn't necessarily slower. I am not sure how important a smaller runtime would be and whether it makes sense to sacrifice some performance. opt-level = s could be a good middle ground.

@ggwpez
Member

ggwpez commented Jan 28, 2022

This isn't surprising. It optimizes for smaller rather than faster code, though smaller isn't necessarily slower. I am not sure how important a smaller runtime would be and whether it makes sense to sacrifice some performance. opt-level = s could be a good middle ground.

There are cases where performance improves, since a smaller binary causes fewer cache misses when fetching instructions. We would have to test it on the exact hardware.
Rust also supports profile-guided optimization, which can provide some automatic cache optimization.
But I'm not sure this is actually a good idea, since we aim for good worst-case performance, and profile-guided optimization makes assumptions about the normal program flow.
This could lead to good benchmarking performance but worse performance when we hit a branch that was not optimized for.

@bkchr
Member

bkchr commented Jan 29, 2022

IMO we should not over-complicate it. Just keep the compiler optimizing for performance, remove the stuff we don't need from the final binary, and then compress it as we already do.

@nazar-pc
Contributor

nazar-pc commented Apr 1, 2022

-C lto=fat -C codegen-units=1 is already used in the production profile; does that mean there is nothing preventing wasm-gc from being removed?

@athei
Member

athei commented Apr 1, 2022

I think it still reduces the code size a bit. I didn't check by how much, though.

@nazar-pc
Contributor

nazar-pc commented Apr 1, 2022

The difference right now is roughly this:

  • zstd polkadot_runtime.wasm 1_182_538 bytes
  • zstd polkadot_runtime.compact.wasm 1_120_459 bytes

So it is only a ~5.2% increase, which is significant but not a deal-breaker, especially considering that things got smaller after -C lto=fat -C codegen-units=1 was introduced.
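For reference, the arithmetic behind the ~5.2% figure, from the compressed sizes quoted above (5.2% is the delta as a fraction of the plain build; expressed as an increase over the compact build it comes out closer to 5.5%):

```rust
// Compressed runtime sizes quoted above, in bytes.
const ORIGINAL: f64 = 1_182_538.0; // zstd polkadot_runtime.wasm
const COMPACT: f64 = 1_120_459.0; // zstd polkadot_runtime.compact.wasm

// Size saved by wasm-gc, as a percentage of the plain build (~5.2%).
fn reduction_pct() -> f64 {
    (ORIGINAL - COMPACT) / ORIGINAL * 100.0
}

// The same delta expressed as an increase over the compact build (~5.5%).
fn increase_pct() -> f64 {
    (ORIGINAL - COMPACT) / COMPACT * 100.0
}

fn main() {
    println!("{:.1}% reduction / {:.1}% increase", reduction_pct(), increase_pct());
}
```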

@pepyakin
Contributor Author

pepyakin commented Apr 5, 2022

I wonder where this difference comes from.

Can you run twiggy diff over those two?

@nazar-pc
Contributor

nazar-pc commented Apr 5, 2022

Uncompressed size:

  • polkadot_runtime.wasm is 4_954_045 bytes
  • polkadot_runtime.compact.wasm is 4_527_986 bytes
$ twiggy diff polkadot_runtime.wasm polkadot_runtime.compact.wasm 
 Delta Bytes │ Item
─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
     +300764 ┊ "function names" subsection
      -51119 ┊ custom section '.debug_str'
      -23698 ┊ custom section '.debug_info'
      -22742 ┊ custom section '.debug_line'
      -16368 ┊ custom section '.debug_pubnames'
       -8873 ┊ <polkadot_runtime::Call as frame_support::traits::dispatch::UnfilteredDispatchable>::dispatch_bypass_filter::hb866e76c7c626a49
       -7840 ┊ custom section '.debug_ranges'
       -7772 ┊ polkadot_runtime::Runtime::metadata::h169c0e3519f6dcf9
       -3199 ┊ polkadot_runtime::_::<impl parity_scale_codec::codec::Encode for polkadot_runtime::Call>::encode_to::h4d530ea869712805
       -3018 ┊ <xcm_executor::XcmExecutor<Config> as xcm::v2::traits::ExecuteXcm<<Config as xcm_executor::config::Config>::Call>>::execute_xcm_in_credit::h3f8e6465ee293b4b
       -2832 ┊ <sp_runtime::generic::era::Era as scale_info::TypeInfo>::type_info::h1a1dae6524123429
       -2685 ┊ polkadot_runtime::_::<impl parity_scale_codec::codec::Encode for polkadot_runtime::Event>::encode_to::h783a8bffe709a93e
       -2395 ┊ <(TupleElement0,TupleElement1) as frame_support::traits::hooks::OnRuntimeUpgrade>::on_runtime_upgrade::he126fdfea7233115
       -2321 ┊ polkadot_runtime_parachains::initializer::<impl polkadot_runtime_parachains::initializer::pallet::Pallet<T>>::apply_new_session::hdd9528bf14999f39
       -2194 ┊ <pallet_staking::pallet::pallet::Call<T> as frame_support::traits::dispatch::UnfilteredDispatchable>::dispatch_bypass_filter::h1f225e22d27b0cf4
       -2128 ┊ pallet_session::<impl pallet_session::pallet::Pallet<T>>::rotate_session::h4ec98104757f4ffe
       -1841 ┊ <xcm::v1::multilocation::MultiLocation as core::convert::TryFrom<xcm::v0::multi_location::MultiLocation>>::try_from::h7793c69f1b61502b
       -1817 ┊ custom section '.debug_abbrev'
       -1731 ┊ pallet_election_provider_multi_phase::unsigned::<impl pallet_election_provider_multi_phase::pallet::Pallet<T>>::prepare_election_result::he31bdc622f13337b
       -1435 ┊ <(TupleElement0,TupleElement1) as frame_support::traits::hooks::OnInitialize<BlockNumber>>::on_initialize::hf592aba116c5ad05
     -200248 ┊ ... and 2740 more.
     -426059 ┊ Σ [2760 Total Rows]
