Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix popcnt for aarch64 #3211

Merged
merged 4 commits into from
Oct 5, 2022
Merged

Conversation

grishasobol
Copy link
Contributor

Description

Fix problem with infinite popcnt WASM instruction execution on M1 processor. See details in corresponded issue: #3210

Review

  1. Make fix for i32.popcnt. Use additional execution branch if src register equal to 1:
if src == 0 => goto exit_label
if src == 1 => goto src_is_one_label

loop_label:
... // popcnt loop body

goto exit_label

src_is_one_label:
mov dest, 1

exit_label:

@ptitSeb
Copy link
Contributor

ptitSeb commented Oct 5, 2022

Thank you for the bug report, the detailled analysis, and the PR :)

I have checked ARM64 documentation, and in fact, the LSL will behave like you describe on all Aarch64 platform, not only M1 cpu, so you can remove the Apple specific compile tag.
The doc list:

result = ShiftReg(n, shift_type, UInt(operand2) MOD datasize);

so we see the modulo datasize for the shift value.

As for the fix, I would prefer some more simple, like changing

        self.assembler.emit_clz(Size::S32, src, tmp)?; // clz src => tmp
        self.assembler
            .emit_add(Size::S32, tmp, Location::Imm8(1), tmp)?; // inc tmp
        self.assembler.emit_lsl(Size::S32, src, tmp, src)?; // src << tmp => src
        self.assembler.emit_cbnz_label(Size::S32, src, label_loop)?; // if src!=0 goto loop

to

        self.assembler.emit_clz(Size::S32, src, tmp)?; // clz src => tmp
        self.assembler.emit_lsl(Size::S32, src, tmp, src)?; // src << tmp => src
        self.assembler.emit_lsl(Size::S32, src, Location::Imm8(1), src)?; // src << 1 => src
        self.assembler.emit_cbnz_label(Size::S32, src, label_loop)?; // if src!=0 goto loop

for the 32bits popcnt, to workaround the 31+1=32 issue.

I tested this with your test on a aarhc64/linux box, and the change is both needed and working.

@grishasobol
Copy link
Contributor Author

Sure)) your solution is much better. Will do it

@grishasobol grishasobol changed the title fix popcnt for m1 and add test fix popcnt for aarch64 Oct 5, 2022
@ptitSeb
Copy link
Contributor

ptitSeb commented Oct 5, 2022

bors r+

bors bot added a commit that referenced this pull request Oct 5, 2022
3211: fix popcnt for aarch64 r=ptitSeb a=grishasobol

# Description
Fix problem with infinite `popcnt` WASM instruction execution on M1 processor. See details in corresponded issue: #3210

# Review
1) Make fix for `i32.popcnt`. Use additional execution branch if `src` register equal to `1`:
```
if src == 0 => goto exit_label
if src == 1 => goto src_is_one_label

loop_label:
... // popcnt loop body

goto exit_label

src_is_one_label:
mov dest, 1

exit_label:
```


Co-authored-by: Gregory Sobol <grishasobol@mail.ru>
@bors
Copy link
Contributor

bors bot commented Oct 5, 2022

Build failed:

@ptitSeb
Copy link
Contributor

ptitSeb commented Oct 5, 2022

@grishasobol can you please run cargo fmt --all and push the diff, the lint is failing, sorry.

@grishasobol
Copy link
Contributor Author

@grishasobol can you please run cargo fmt --all and push the diff, the lint is failing, sorry.

Done

@grishasobol grishasobol requested review from ptitSeb and removed request for syrusakbary October 5, 2022 16:02
@ptitSeb
Copy link
Contributor

ptitSeb commented Oct 5, 2022

bors r+

@bors
Copy link
Contributor

bors bot commented Oct 5, 2022

Build succeeded:

@bors bors bot merged commit 4ad7020 into wasmerio:master Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants