Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dramatically improving the build time of web-sys #2012

Merged
merged 38 commits into from
Mar 2, 2020

Conversation

Pauan
Copy link
Contributor

@Pauan Pauan commented Feb 23, 2020

This is a pretty big change which touches on a bunch of things, but it shouldn't be a breaking change.

The reason for this PR is that web-sys takes an absurdly long time to build:

1:18 minutes web-sys build.rs
2:01 minutes web-sys compile
4:17 minutes total

As you can see, web-sys alone is taking up 3:19 minutes out of the 4:17 minutes build time. A big chunk of that is running the build.rs script, but even the compilation afterwards takes way too long.

This is in a project which is about as close to "hello world" as it gets:

[dependencies.web-sys]
version = "0.3.35"
features = [
    "console",
    "Window",
]

So it really should not be taking that long. And even worse, it has to do a full recompilation every time you add a new feature to the features list, which happens often during development.

This PR completely fixes that, so now it looks like this:

0:07 minutes web-sys compile
1:03 minutes total

The build.rs script is completely gone, and now it takes a mere 7 seconds to compile web-sys, which means the project now only takes 1:03 minutes to build rather than 4:17 minutes.

This PR accomplishes that by doing the following:

  1. Rather than using a build.rs script to re-parse and re-generate from WebIDL every single time, instead it uses a bin/build-web-sys script to generate the WebIDL once. This script only needs to be re-run when changing the WebIDL.

  2. This new script creates a separate file for each feature, so the "Window" feature corresponds to the web-sys/src/bindings/gen_Window.rs file.

  3. It creates a web-sys/src/bindings/mod.rs file which re-exports all of the gen_ files, and it uses cfg so each file will be behind a feature flag. That means Rust doesn't even need to parse the files which aren't enabled, so everything is much faster.

  4. It adds new cfg flags inside of the gen_ files to accommodate things like Window::navigator which should only be enabled if Navigator is enabled.

    This also means it needed to add some cfg flags to the descriptor glue code generated by #[wasm_bindgen]. I don't particularly like this, but I don't see a better way.

  5. In Cargo.toml it generates the [features] list with appropriate dependencies already specified (so "Window" depends on "EventTarget", etc.)

This is a bit of a rough PR, so there might be some mistakes or a better way to accomplish this, but I think it's a solid start.

This has the following benefits:

  1. Obviously this is far faster, saving 3:12 minutes on build time for web-sys.

  2. The wasm-bindgen-webidl crate is no longer a dependency of web-sys, further improving the build times.

  3. It's no longer necessary to do hacky things like setting the __WASM_BINDGEN_DUMP_FEATURES env var.

  4. Because the generated files are checked into git, that means we get git diffs when changing the WebIDL.

  5. The generated code is properly formatted, so rustdocs will actually be usable now.

  6. I think this moves us a little bit closer to allowing users to generate their own crates based on WebIDL.

To actually run the script, just cd bin/build-web-sys and then do cargo run --release. All of the code will be auto-generated in the right folders. The features list will be in the web-sys/features file, which can then be copied over to the web-sys/Cargo.toml file.

@Pauan
Copy link
Contributor Author

Pauan commented Feb 23, 2020

@alexcrichton I can't figure out how to fix the webidl-tests. When I run cargo test it doesn't run the webidl-tests/build.rs script at all.

@alexcrichton
Copy link
Contributor

Nice wins! I'm definitely eager to get this landed since these are pretty massive wins and are huge boons for beginners first trying out all this!

Before we dive too much into tests though I'd like to get some more information about this if we can.

Correct me if I'm wrong, but it looks like this is pre-generating source code so when you download web-sys from crates.io you don't need to parse any WebIDL, is that right? If so then we originally explicitly actually chose to not do this for the one reason that the generated code from wasm-bindgen is actually unstable with respect to the wasm-bindgen crate itself. The generated code from wasm-bindgen 0.2.0, for example, is not guaranteed to compile against wasm-bindgen 0.2.1 and beyond. At an API level everything is stable, but the internals necessary to get everything working over time have been tweaked.

I think this may be solvable, though, by using an =A.B.C dependency from web-sys to wasm-bindgen. The web-sys generated code will only be compatible with one version of wasm-bindgen, and the = dependency should take care of that. I also don't think this will cause too many practical issues in the ecosystem. I think Cargo should handle this pretty gracefully and still allow for pretty painless upgrades.


Additionally you've posted some timings, but could you break them down a bit? For example in these timings:

1:18 minutes web-sys build.rs
2:01 minutes web-sys compile
4:17 minutes total

what is web-sys build.rs? Does that include compiling the build script or executing the build script? Does that include dependencies of the build script or just the build script itself? Was this in release or debug mode? FWIW we pretty specifically tried to optimize web-sys for build times, and 1:18 for anything other than the tree of dependencies is pretty surprising to me. Additionally 2:01 seems extremely large for the compile time of web-sys, but is this all related to parsing everything and then discarding it? (how having separate files you're saying allows rustc to skip files entirely?)

Also what's the other time in this measurement? The two sub-components don't add up to the total so I'm curious where the total came from?


It sounds like you're updating dependencies between features I think? I don't think we really intentionally avoided that from before, but I thought it was the case that if you enabled Window but not EventTarget you only got the methods for Window and didn't get any methods for EventTarget (and we just left out the dependency chain). Am I misremembering though?


Finally, when going the code generation route, can you add a CI check that web-sys doesn't need to be regenerated? (basically ensuring that what's checked in is up-to-date)

@Pauan
Copy link
Contributor Author

Pauan commented Feb 25, 2020

this is pre-generating source code so when you download web-sys from crates.io you don't need to parse any WebIDL, is that right?

That is correct. The code is fully and completely generated, so all rustc needs to do is load the appropriate .rs files based on the feature flags.

The generated code from wasm-bindgen 0.2.0, for example, is not guaranteed to compile against wasm-bindgen 0.2.1 and beyond.

That's a good point. So rather than generating fully expanded code, it should instead generate code like this:

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(extends = EventTarget)]
    pub type Window;

    // generate methods for Window...
}

I like that a lot better anyways, it's a lot cleaner, and it fixes some of the issues/messiness I ran into. So I'll make that change.

what is web-sys build.rs? Does that include compiling the build script or executing the build script?

I don't know. I simply watched the Cargo build output in the terminal, and when it said web-sys(build) I started a stopwatch, and when it changed to web-sys I stopped the stopwatch. You probably know more about what that means than I do.

I assume it includes the time spent compiling the build.rs script and the time spent actually running the build.rs script.

Does that include dependencies of the build script or just the build script itself?

I saw it compiling wasm-bindgen-webidl, but I did not include that in the times (I started the stopwatch after it had finished compiling). So the times are without any dependencies, they are purely about web-sys itself.

Also what's the other time in this measurement? The two sub-components don't add up to the total so I'm curious where the total came from?

The total time is the total time spent to compile the project. So that means compiling all of the dependencies, and compiling the project itself. So that includes web-sys, but it also includes wasm-bindgen, js-sys, and various other dependencies. In other words, it's the time from starting cargo build until cargo build finishes.

Of course the times are with a clean project (no target directory), so this is a cold build, which means it's the worst case scenario. But it doesn't include download times, since the tests were done with local dependencies (using path in Cargo.toml).

Was this in release or debug mode?

Release mode. I never compile my projects in debug mode, because the runtime performance hit is really huge, and the build times for debug mode aren't that much faster.

FWIW we pretty specifically tried to optimize web-sys for build times, and 1:18 for anything other than the tree of dependencies is pretty surprising to me.

I wasn't surprised that the build.rs took so long, but I was surprised that compiling took so long, since it should have pruned out any non-Window things, so the compilation shouldn't have taken 2 minutes.

Additionally 2:01 seems extremely large for the compile time of web-sys, but is this all related to parsing everything and then discarding it? (how having separate files you're saying allows rustc to skip files entirely?)

My understanding is that the old code should simply have pruned out all of the non-Window code, so it should have been extremely fast, but it obviously isn't. I didn't bother to investigate it too deeply, since the build.rs itself is cripplingly slow, so I figured it doesn't matter why compilation is slow.

I thought it was the case that if you enabled Window but not EventTarget you only got the methods for Window and didn't get any methods for EventTarget (and we just left out the dependency chain). Am I misremembering though?

I'll need to do some checking on that. I think it's possible to use some cfg trickery to make it work without enabling the "EventTarget" feature.

Finally, when going the code generation route, can you add a CI check that web-sys doesn't need to be regenerated? (basically ensuring that what's checked in is up-to-date)

That's a good idea, I'll look into that.

@alexcrichton
Copy link
Contributor

Oh so that was actually another idea we had early on, emitting #[wasm_bindgen] annotations. We concluded though that it was like making a whole separate compiler and decided not to do it due to the work involved. For example there's no implemented way to go from an AST back to what generated that AST today, and the WebIDL stuff basically parses straight into the AST the backend uses. Basically I'm saying that generating #[wasm_bindgen] annotations, which while I agree is probably ideal, is likely a significant chunk of work.

But hey if it can be done it'd be pretty slick :)

Ok thanks for explaining the build timings! My intuition is definitely that for wasm release mode is almost always the same in terms of build time as debug mode, but the proc macros required to build wasm are often significantly more costly in release mode than debug mode in terms of compile time. Cargo's recently stabilize profile overrides feature may help here as well, but I think it's probably best to fix it at the source, it's really hard to beat a 7s compile time with web-sys :).

@Pauan
Copy link
Contributor Author

Pauan commented Feb 26, 2020

I've been making good progress on having it generate #[wasm_bindgen] code. As you said, it is a lot of work, but I think it will be a lot cleaner in the long run, and it lets us clean up a lot of stuff in wasm-bindgen-backend.

Also, after running some tests, if you enable the Window feature in web-sys, then EventTarget will also be available. So generating a feature like Window = ["EventTarget"] is actually correct, since it matches the old behavior.

@alexcrichton
Copy link
Contributor

Very nice! That all sounds great to me 👍

@Pauan
Copy link
Contributor Author

Pauan commented Feb 28, 2020

@alexcrichton I got everything ported to use #[wasm_bindgen]. I'm quite pleased with the result: I think the code is a lot cleaner and easier to understand.

And it actually involved removing large chunks of complex code from wasm-bindgen-backend and replacing it with much simpler code.

Now compiling web-sys takes 10 seconds rather than 7 (because it has to macro-expand the #[wasm_bindgen] attributes). But I think that's perfectly acceptable.

Now I just need to get the unit tests and CI working.

I had to add a new #[wasm_bindgen(typescript_name = Foo)] attribute, since previously that functionality was used by web-sys but it wasn't exposed publicly. Maybe I should rename it to typescript_type, so it doesn't duplicate #1986

I cleaned up the bin script too, so now you can go into the crates/web-sys folder and then run cargo run --release --package wasm-bindgen-webidl webidls src/features. That means it's now possible for users to install the wasm-bindgen-webidl program and run it on their own crates, so it's no longer specific to web-sys.

Copy link
Contributor

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This looks great to me, I'm also pretty happy with how this worked out :)

I'd be fine renaming attributes as appropriate, and adding new attributes is fine by me too. Mind making sure there's docs on them in the guide though?

Also the cleanups look great for the web-sys build script and such, and agreed that this is a great step forward to custom-webidl crates!

crates/webidl/src/main.rs Outdated Show resolved Hide resolved
crates/webidl/src/generator.rs Outdated Show resolved Hide resolved
crates/webidl/src/main.rs Outdated Show resolved Hide resolved
crates/webidl/src/main.rs Outdated Show resolved Hide resolved
crates/webidl/src/main.rs Outdated Show resolved Hide resolved
crates/webidl/src/util.rs Show resolved Hide resolved
crates/web-sys/src/features/gen_HtmlHrElement.rs Outdated Show resolved Hide resolved
@Pauan
Copy link
Contributor Author

Pauan commented Feb 29, 2020

@alexcrichton As a side effect of generating #[wasm_bindgen] annotations, I also needed to make the following changes:

  • Enums now support string values (not just numbers).

  • Full type paths are now supported:

    #[wasm_bindgen]
    extern "C" {
        fn foo(x: path::to::Ty);
    }
  • A new #[wasm_bindgen(typescript_type = "Foo")] attribute for controlling the TypeScript name of a type.

These features already existed, but they were private to wasm-bindgen-backend and could only be used by web-sys. Now they can be used by everybody.

I've also added in docs for typescript_type, and I added a CI script to verify that web-sys is generated correctly. Except it doesn't seem to be working.

Also, the web-sys unit tests are failing with a strange error, and I can't figure out what's causing it.

@alexcrichton
Copy link
Contributor

Those new features sound good to me, thanks!

I'll see if I can try to help and dig into the failures here.

@alexcrichton
Copy link
Contributor

Ok I think that's just a preexisting bug where if you define the same thing within a crate's tests and main crate it causes issues. This didn't previously happen becaue the hash generation for web-sys was "special", but now that it's no longer special it ran afoul of this issue. I just went ahead and removed the Location definitions in the tests and opted to use web-sys instead.

@alexcrichton
Copy link
Contributor

Ok I can reproduce that failure locally. It looks like new HTMLOptionElement(...) doesn't work but the bindings previously used new Option(...). Need to track down where that happened...

@alexcrichton
Copy link
Contributor

Ok I think everything might be working now, we'll see with CI.

I've switch away from custom rustfmt settings to stable rustfmt for everything because I would prefer to keep a simple workflow of cargo fmt working, and it's generated code after all and isn't intended to be the most readable thing in the world.

@alexcrichton
Copy link
Contributor

Oh boy and now it's all green, yay!

@Pauan wanna take a look at the changes I made? Or do you otherwise have anything else that this needs to do before landing?

Instead refactor things so webidl-tests can use the Rust-code-generation
as a library in a build script. Also fixes `cargo fmt` in the
repository.
@Pauan
Copy link
Contributor Author

Pauan commented Mar 2, 2020

@alexcrichton Thanks! I rebased and touched up a few things. I think this is ready for review/merge.

I'm curious why named constructors worked in the past, I thought I had very carefully ported all of the functionality over, so I'm not sure how I missed that.

@alexcrichton
Copy link
Contributor

Ok great! Let's land then. Thanks so much again for doing all the work here!

@alexcrichton alexcrichton merged commit 3f4acc4 into rustwasm:master Mar 2, 2020
@Pauan Pauan deleted the faster-web-sys branch March 2, 2020 23:43
@chinedufn
Copy link
Contributor

Sweet!
Mind cutting a new web-sys release since this change allows auto complete in most editors / IDEs / etc to work?

@alexcrichton
Copy link
Contributor

Certainly!

It appears I've waited far too long for a release, this is a huge release!

@chinedufn
Copy link
Contributor

Woo!!

@panstromek
Copy link

panstromek commented Jun 6, 2020

Sadly, this PR broke analysis and completion for web_sys in intellij-rust, right at a time when they started supporting items generated by build scripts. So when analysis for web_sys finally started working, which was a huge improvement, it broke again in just a few weeks.

I feel bad for saying this after reading the discussion here, but the original version of this PR that generated the expanded code directly would be much more prefearable for IDEs like IntelliJ because it would just work out of the box.

On the other hand, the final version generates #[wasm_bindgen] annotations which is better for compatibility and maintenance (as far I can tell from the discussion), but it's much more challenging for editors, because it requires some level of support for proc-macros, which is a huge and problematic task that basically no editor fully supports at the moment and it probably won't happen soon (only RA has some basic support, but I am not sure how far they got).

For this reason, lot of users of intellij-rust actually pin web_sys version to the older one which still generates bindings with build script, because the IDE support makes a huge difference. It's worth noting that wasm is one of the big entrypoints to rust ecosystem, so having good editor support for it is definitely a good feature to have for newcomers.

I'm not sure what to do about it, but I felt like it's at least worth pointing that out. You might consider generating some more "ide-friendly" code, but I can see that this approach has some major advantages so leaving it as is definitely understandable (and proc-macro support in IDEs will need to happen at some point anyway).

If you want to see specifically what is problematic, I wrote it in the comment on releveant intellij-rust issue - intellij-rust/intellij-rust#5104 (comment) .

@Pauan
Copy link
Contributor Author

Pauan commented Jun 6, 2020

@panstromek Increasing the build time from 10 seconds to 4:17 minutes just to get IDE support is not worth it. The build times are a much bigger showstopper.

Proc macros are extremely common in Rust, they are used constantly and pervasively. Many libraries (including wasm-bindgen) require proc macros. Asking libraries to not use proc macros is not going to work, instead the IDEs need to work on adding support for proc macros.

Also, even if we did revert this PR, that would only fix web-sys: IDE support would still be broken with every other crate (including js-sys), because they all use the #[wasm_bindgen] macro (which is necessary).

Your link doesn't seem to have anything to do with web-sys, that seems like a more general problem with IDEs + wasm-bindgen, so you should file a new issue about that.

@chinedufn
Copy link
Contributor

chinedufn commented Jun 6, 2020

I follow intellij-rust releases closely and write a good bit of Rust + wasm and I’ve never had auto complete work neither before nor after this PR landed.
Even after intellij-rust got out dir support.

@panstromek are you positive that this ever worked in the first place? I noticed that in the linked issue you expressed that you weren’t certain.

Just asking because I distinctly remember being excited about out dir support and it not working with web_sys (as well as most other libraries).

I had the setting for it enabled.

It could’ve just been something that I faced though and maybe it was working for others.

@panstromek
Copy link

panstromek commented Jun 6, 2020

@panstromek Increasing the build time from 10 seconds to 4:17 minutes just to get IDE support is not worth it.

That's definitely not what I had in mind. I wouldn't want to revert this PR, because the win is huge. If anything, I would only advocate for using the first version of it that generated the expanded code directly without #[wasm_bindgen] annotations (and from the comments above it compiled even faster) or something along those lines.

Apart from that, it is not universaly true, because you pay this compile time cost only once for initial build, while IDE support is something you use all the time, so there's a clear tradeoff that some people are willing to make.

Proc macros are extremely common in Rust, they are used constantly and pervasively. Many libraries (including wasm-bindgen) require proc macros. Asking libraries to not use proc macros is not going to work, instead the IDEs need to work on adding support for proc macros.

Yes, but proper IDE support is still years away (and some of it is even impossible). Until then, I don't see anything wrong with being "IDE-friendly", especially in cases like this, where proc macro is more a convenience then necessity. But as I said, it's totally understandable to leave it as is.

Also, there is a difference - some proc-macros are much more IDE-friendly than others, especially ones that just generate internal code or glue, so in practice most commonly used proc macro crates work just fine. #[wasm_bindgen] is problematic because it takes "public" syntax and transforms it into some different, incompatible "public" syntax (instead of just sugaring it, like serde, rocket or most derive macros do, for example). Without the annotations, the code doesn't even compile and IDE has no chance of inferring the final structure without fully supporting proc macros.

Also, even if we did revert this PR, that would only fix web-sys: IDE support would still be broken with every other crate (including js-sys), because they all use the #[wasm_bindgen] macro (which is necessary).

This is true, the problem is more general than just web-sys but in practice, just web-sys alone makes a huge difference. The other use cases are much less common. js_sys is relevant but not as much, and defining your own extern types is something I basically never do.

@chinedufn It worked for me once, even though I had to do little bit of tuning that I described here: intellij-rust/intellij-rust#3066 (comment) It could autocomplete and resolve to the generated file. Note that I had older version of web-sys on that project. After I updated web-sys, it stopped working again.

@Pauan
Copy link
Contributor Author

Pauan commented Jun 6, 2020

@panstromek I would only advocate for using the first version of it that generated the expanded code directly without #[wasm_bindgen] annotations (and from the comments above it compiled even faster) or something along those lines.

That doesn't work, because it will cause version conflicts (which is why I changed it to use #[wasm_bindgen]).

So the only two choices are the 4:17 minute build time version or the #[wasm_bindgen] version.

Until then, I don't see anything wrong with being "IDE-friendly", especially in cases like this, where proc macro is more a convenience then necessity.

It is not a convenience, it is a necessity. The #[wasm_bindgen] macro is fundamental and necessary to the entirety of wasm-bindgen.

#[wasm_bindgen] is problematic because it takes "public" syntax and transforms it into some different, incompatible "public" syntax

Like I said, that is a more general problem than web-sys, so you should file a new issue about that.

@panstromek
Copy link

That doesn't work, because it will cause version conflicts (which is why I changed it to use #[wasm_bindgen]).
So the only two choices are the 4:17 minute build time version or the #[wasm_bindgen] version.

I wrote these comments under assumption that this is not the case, because it didn't follow from the discussion above. Quoting Alex:

I think this may be solvable, though, by using an =A.B.C dependency from web-sys to wasm-bindgen. The web-sys generated code will only be compatible with one version of wasm-bindgen, and the = dependency should take care of that. I also don't think this will cause too many practical issues in the ecosystem.


It is not a convenience, it is a necessity. The #[wasm_bindgen] macro is fundamental and
necessary to the entirety of wasm-bindgen.

I specifically meant web-sys case, not #[wasm_bindgen] in general. If the above holds, the macro usage is not strictly necessary.

Like I said, that is a more general problem than web-sys, so you should file a new issue about that.

There is already an issue about general proc macro support, this is no different.


Overall, I don't understand why you have such a dismissive attitude about it. I don't think I wrote anything controversial and I think that IDE support is reasonable concern to have, especially considering such a hard feature to support like proc macros.

I commented here because it seemed like something that could be enabled on the wasm-bindgen side with way less effort than on the editor side, where it requires big part of proc macro support and that just won't happen anytime soon, like maybe for few another years ( and full #[wasm_bindgen] macro is impossible to support in all cases), so I thought it's worth thinking about it here. It's a recurring topic and it would certainly help a lot of people, especially newcomers.

If this is not feasible, then fine - but that wasn't immediately obvious from the thread and I don't think there is a reason to be so negative about it.

@Pauan
Copy link
Contributor Author

Pauan commented Jun 6, 2020

@panstromek What you are asking for is unreasonable. You are asking for very large changes which will hurt maintainability, lock the version of wasm-bindgen, make the docs far worse, and undo all of the major work which was done to move to #[wasm_bindgen] (which is much better than compiling to expanded code). And the only benefit is that web-sys will have better IDE support (but not the rest of wasm-bindgen).

I'm not being "dismissive" or "negative", I am explaining to you in a straightforward and matter of fact way why the changes you are asking for won't work. Yes IDE support is important, but not if it significantly hurts other areas. There are trade-offs here, this isn't just a simple "oh we can easily support IDE" situation.

If there is something about the #[wasm_bindgen] macro which we can change to make it easier for IDEs to support it, I think we should discuss that (in a new issue).

@Pauan
Copy link
Contributor Author

Pauan commented Jun 6, 2020

@panstromek To put it into perspective, you are essentially asking for a complete rewrite of the web-sys code generation, and it also requires changes to the wasm-bindgen internal ABI. It is not some small change where we can just revert a couple things. There's a reason this (very large) PR required 9 days of hard non-stop work to implement.

@panstromek
Copy link

Well, if you put it like that, then ok. From what I can tell, it seemed like just reviving back something you already had implemented before (that's how it looks like from the first few comments, along with PR description).

And I beg to differ, you didn't actually explain anything, you just said hard "no" and gave information that contradicts other comments at the begining of this thread, which didn't help at all in understanding why this would be a problem. It's clearer now after your last comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants