Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent crash using ThreadRng::gen() in webkit based browser with wasm-bindgen #1016

Closed
kellpossible opened this issue Aug 15, 2020 · 17 comments

Comments

@kellpossible
Copy link

kellpossible commented Aug 15, 2020

Common issues

I'm getting an intermittent Unhandled Promise Rejection crash with my code compiled using wasm-bindgen, via wasm-pack using the web target, and the --dev profile. This code is running in the webkit2gtk 2.28.4-1 browser on Archlinux.

I'm using rand version 0.7.3 and the wasm-bindgen feature enabled.

The line of code that is triggering this error is:

rand::thread_rng().gen()

The backtrace for the error within rand:

[Error] Unhandled Promise Rejection: Error: Out of bounds memory access (near '...e__h0d03c6547e5e0185(arg0, arg1, addHeap...')
	<?>.wasm-function[<T as core::convert::Into<U>>::into::h8e2dc1dc8cfddf13] (data:text/html,%3C…%3E:271:135)
	<?>.wasm-function[ppv_lite86::generic::dmap2::h9a1309b08a243710]
	<?>.wasm-function[rand_chacha::guts::refill_wide::ha379d3985d906d93]
	<?>.wasm-function[<rand_chacha::chacha::ChaCha20Core as rand_core::block::BlockRngCore>::generate::h228564273ae9a3c9]
	<?>.wasm-function[<rand::rngs::adapter::reseeding::ReseedingCore<R,Rsdr> as rand_core::block::BlockRngCore>::generate::h6885e5a5fa188921]
	<?>.wasm-function[rand_core::block::BlockRng<R>::generate_and_set::h6505ddccc6a5f801]
	<?>.wasm-function[<rand_core::block::BlockRng<R> as rand_core::RngCore>::next_u32::h8b977afb2b63748d]
	<?>.wasm-function[rand::distributions::integer::<impl rand::distributions::Distribution<u32> for rand::distributions::Standard>::sample::hc3ae3caece8847d4]
	<?>.wasm-function[rand::Rng::gen::hb1a7fdc6a5707799]
kellpossible added a commit to ProvableHQ/yew_webview_bridge that referenced this issue Aug 15, 2020
@kellpossible
Copy link
Author

kellpossible commented Aug 15, 2020

Perhaps it's a problem in this crate? https://github.com/cryptocorrosion/cryptocorrosion/blob/master/utils-simd/ppv-lite86/src/generic.rs#L106

Should the ppv-lite86 even be used in the wasm32-unknown-unknown context? Itself being described as: Implementation of the crypto-simd API for x86.

@bjorn3
Copy link
Contributor

bjorn3 commented Aug 15, 2020

generic.rs is the implementation for architectures other than x86, or when simd is not enabled.

@dhardy
Copy link
Member

dhardy commented Aug 19, 2020

Reading the code tells me that dmap2::T == u32x4_generic and dmap2::F == u32::wrapping_add. The .into method is therefore this one, but looks to me like that should be a no-op (all uses are as a [u32; 4]):

https://github.com/cryptocorrosion/cryptocorrosion/blob/8bd2124d05c774febb90a6f0cc1b6bb875668037/utils-simd/ppv-lite86/src/generic.rs#L447

Perhaps WASM doesn't handle the vec128_storage union correctly? It includes a u128 which might force different alignment — without #[repr(C)] the internal representation is actually unspecified.

Can you try adding #[repr(C)] to vec128_storage and see if that fixes it?

@dhardy
Copy link
Member

dhardy commented Aug 19, 2020

@kazcw
Copy link
Contributor

kazcw commented Aug 19, 2020

Perhaps WASM doesn't handle the vec128_storage union correctly? It includes a u128 which might force different alignment — without #[repr(C)] the internal representation is actually unspecified.

@dhardy I bet this is it. The behavior of this union is well-defined (size should be same as u128 per the reference you linked--"size of a union is determined by the size of its largest field", alignment should be same as u128 per the nomicon--"composite structures have an alignment equal to the maximum of their fields' alignments"), but this wouldn't be the first time I've needed to work around a bug in wasm32's u128 handling. I see no reason not to use repr(C); it's defined to have the same size and alignment as repr(Rust) in this case, and may avoid the problem. I'll get an environment set up to reproduce it and then see if wasm32 does any better with the other repr.

kazcw added a commit to kazcw/rand1016 that referenced this issue Aug 19, 2020
@kazcw
Copy link
Contributor

kazcw commented Aug 19, 2020

Using #[repr(C)] on the union doesn't seem to make any difference.

@dhardy
Copy link
Member

dhardy commented Aug 19, 2020

Using #[repr(C)] on the union doesn't seem to make any difference.

Not surprising. Nevertheless, relying on that is relying on UB (from my understanding), so we ought to fix it.

That leaves a WASM bug?

@kazcw
Copy link
Contributor

kazcw commented Aug 19, 2020

relying on that is relying on UB (from my understanding), so we ought to fix it.

I don't see how this would be the case. If I were transmuting the union itself, I would agree that repr(Rust) is dubious. But I'm only accessing members, and the behavior of doing that is well-defined in the docs. This is the basic use case of unions and if it were UB under repr(Rust), unions would be required to be repr(C).

That leaves a WASM bug?

Since the error occurs when initializing the union, and there's no wrong way to do that, I think it has to be a WASM bug. But we'll probably want a workaround for users of toolchains older than whenever a fix comes out. I'll fix mod generic to avoid unions with u128 members.

@dhardy
Copy link
Member

dhardy commented Aug 19, 2020

he behavior of doing that is well-defined in the docs

No, and the docs are quite explicit:

Fields might have a non-zero offset (except when #[repr(C)] is used)

Effectively, writing to and then reading from a #[repr(C)] union is analogous to a transmute from the type used for writing to the type used for reading.

It's not the only use-case: the other one is untagged enums.

@kazcw
Copy link
Contributor

kazcw commented Aug 19, 2020

I don't think it's possible for the compiler to use any but the obvious layout for this union without violating the documented properties of repr(Rust). While a C compiler would skin a programmer for using that kind of logic, it's my impression that to the Rustacean, no line of reasoning is verboten. But even if I'm right about both those things, the correctness of repr(C) is more obvious, so I'll change it while I'm reimplementing vec128_storage to suit wasm32's capabilities.

@kazcw
Copy link
Contributor

kazcw commented Aug 19, 2020

Hm, removing the u128 member from the union didn't solve the problem. It must have only seemed to fix it because the crash is non-deterministic and perturbing the compilation resulted in a different outcome. With u128 (and repr) issues ruled out, I can't see why instantiating this simple union would crash. Currently combing through the WASM disassembly in the backtrace.

@kazcw
Copy link
Contributor

kazcw commented Aug 19, 2020

Actually, I think after that change I'm hitting a different problem. It's not a bad memory access now, it's entering an abort branch when a read from memory returns 1. It fits the pattern of something like a mutex acquisition failure, so I'm wondering if it has to do with the way the ThreadRng synchronization code is compiling to the single-threaded wasm environment. I haven't been able to get function names or any other source map info in firefox, which would make things clearer. I'll try it under webkit2gtk.

@dhardy
Copy link
Member

dhardy commented Aug 20, 2020

documented properties of repr(Rust)

What are these?

@dhardy
Copy link
Member

dhardy commented Aug 20, 2020

ThreadRng synchronization

There isn't any as such, but there is a thread_local for initialisation. But it looks like this uses an unsynchronised mechanism.

@kellpossible
Copy link
Author

kellpossible commented Sep 19, 2020

Every day I'm thinking that it's more and more likely that this is a problem specific to libwebkit2gtk (some bug in their wasm implementation) and this issue can probably be closed.

@dhardy
Copy link
Member

dhardy commented Sep 19, 2020

Since we have adjusted ThreadRng in #1035 it would be interesting to know whether the issue persists on the master branch (or 0.8 when it releases).

@kellpossible
Copy link
Author

I've not had the time to create a reproducible example that I can share, this issue appears to be exclusively occurring on libwebkit2gtk, perhaps it's a bug with the webassembly implementation there. After several weeks running my application in chomium and firefox I've yet to reproduce the issue in those browsers. I've also had some rendering issues exclusive to libwebkit2gtk so probably will ditch it at this point and use some other solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants