Replace temporary lowering of Wasm's `load_splat` with a new Cranelift instruction #1175

abrown · 2020-01-14T23:29:40Z

What is the feature or code improvement you would like to do in Cranelift?

In bytecodealliance/cranelift#1347 I added a temporary lowering for Wasm's load_splat to two Cranelift instructions, load + splat. This generates extra instructions that could be removed by a specialized Cranelift load_splat instruction or by smarter codegen (e.g. complex addressing on splat).

What is the value of adding this in Cranelift?

Fewer instructions produced.

Do you have an implementation plan, and/or ideas for data structures or algorithms to use?

Seeking feedback on which way to proceed: specialized load_splat or smarter codegen.

Have you considered alternative implementations? If so, how are they better or worse than your proposal?

See above.

The text was updated successfully, but these errors were encountered:

abrown · 2020-01-14T23:35:13Z

@sunfishcode, I wonder if the load-from-constant-table questions I was asking also relate to this (in regard to the complex addressing).

akirilov-arm · 2020-07-24T21:48:14Z

Having a specialized load_splat Cranelift instruction would be beneficial for the AArch64 backend (so that it could generate LD1R instructions). I tried the alternative approach quickly, i.e. smarter codegen, which in the AArch64 case means using maybe_input_insn(), but it didn't work (I was getting None as the return value) - the reason was probably because the load might have side effects.

abrown · 2020-07-24T21:59:45Z

Let's do that then; in the past, it's always been ok to create Cranelift instructions that map exactly to Wasm SIMD instructions so that seems like a good path forward. If you add it (somewhere in meta/src/shared/instructions.rs) feel free to tag me and I'll review it.

akirilov-arm · 2020-07-27T22:31:28Z

I'd be interested in having a wider discussion; in particular, @cfallin might have a different opinion with respect to the AArch64 backend.

cfallin · 2020-07-27T22:49:01Z

I think it's a reasonably pragmatic solution to have the dedicated instruction; it reduces work overall in the pipeline if we can carry the whole package of operations together as one unit from Wasm to machine code.

There's a possible concern if we have a lot of these; it's certainly nice to only have to worry about literal "load" and "store" instructions when analyzing loads and stores. But perhaps we get around that by having accessors on the instruction ("this inst is a load of some form, with address X and size Y"). Or, in the worst case, the instruction is a black-box with unknown side-effects from the point of view of other optimization passes, so it inhibits some optimization but is still correct.

@akirilov-arm, the reason that your (quite reasonable!) approach attempting to merge the ops in the backend didn't work is indeed because load is side-effecting, so will never be returned by maybe_input_insn(). This prevents the backend from arbitrarily sinking a side-effecting op, which could lead to a miscompile (consider if the load-splat were sunk past a store to an aliasing pointer, for example). The "correct" thing to do is to hoist the splat instead and then merge, but the framework isn't really set up to allow that.

alexcrichton transferred this issue from bytecodealliance/cranelift Feb 28, 2020

alexcrichton added the cranelift Issues related to the Cranelift code generator label Feb 28, 2020

abrown mentioned this issue Apr 2, 2020

OOB memory access #1407

Closed

akirilov-arm mentioned this issue Oct 7, 2020

Introduce the Cranelift IR instruction LoadSplat #2278

Merged

cfallin closed this as completed in #2278 Oct 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace temporary lowering of Wasm's `load_splat` with a new Cranelift instruction #1175

Replace temporary lowering of Wasm's `load_splat` with a new Cranelift instruction #1175

abrown commented Jan 14, 2020

abrown commented Jan 14, 2020

akirilov-arm commented Jul 24, 2020

abrown commented Jul 24, 2020

akirilov-arm commented Jul 27, 2020

cfallin commented Jul 27, 2020

Replace temporary lowering of Wasm's load_splat with a new Cranelift instruction #1175

Replace temporary lowering of Wasm's load_splat with a new Cranelift instruction #1175

Comments

abrown commented Jan 14, 2020

What is the feature or code improvement you would like to do in Cranelift?

What is the value of adding this in Cranelift?

Do you have an implementation plan, and/or ideas for data structures or algorithms to use?

Have you considered alternative implementations? If so, how are they better or worse than your proposal?

abrown commented Jan 14, 2020

akirilov-arm commented Jul 24, 2020

abrown commented Jul 24, 2020

akirilov-arm commented Jul 27, 2020

cfallin commented Jul 27, 2020

Replace temporary lowering of Wasm's `load_splat` with a new Cranelift instruction #1175

Replace temporary lowering of Wasm's `load_splat` with a new Cranelift instruction #1175