Parser improvements #5

fabi321 · 2023-06-15T14:21:04Z

Multiple parser improvements, on both the readability and the performance side.

The new SIMD hex parser has one major drawback: for invalid characters, the result is undefined behavior instead of 0.

fabi321 · 2023-06-15T14:26:06Z

The alpha blending could probably benefit from SIMD usage, too, but I'm kinda waiting for benches that include alpha for this

reduces the time from 13.5ms to 11ms (~-20%), and breaks the benchmark

sbernauer · 2023-06-15T19:09:17Z

Really awesome, I like!

I think the UB is fine, when the user provides garbage values it's ok to color the pixel with some random value I guess.

I want to write some notes for README (e.g. that we need to use nightly rust) before merging this.
Had some trouble getting reproducible benchmark results, do you have speedup numbers from your machine?
I would also be interested in the emitted assembly code using cargo asm --rust breakwater::parser::simd_unhex, but it can't find the specified function despite all my efforts with #[inline(never)] and calling simd_unhex directly from main.

sbernauer · 2023-06-15T19:11:49Z

Nice time-travel btw!

fabi321 · 2023-06-15T20:18:38Z

As I mentioned in one of the commits, the speedup on my machine was, averaging across many runs, from 13.5ms to 11ms. My bench results have been pretty stable at 11.05±0.1ms

fabi321 · 2023-06-15T20:26:01Z

I also think that UB is fine, that's why I used it in the first place. It's just something that is different now due to the way the conversion happens: it is no longer a lookup, but rather a calculation that is only defined for 0123456789abcdefABCDEF.
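To illustrate why a calculation is only meaningful for valid hex digits, here is a minimal scalar sketch of one common lookup-free conversion. The function name `unhex_nibble` and the constants `0x0F` and `9` are assumptions chosen for illustration; they are consistent with the shift-by-6, mask, and add steps in the disassembly posted in this thread, but the thread does not show the actual constants.

```rust
/// Converts one ASCII hex digit to its numeric value without a lookup table.
/// Hypothetical helper: name and constants are assumptions, not the PR's code.
fn unhex_nibble(c: u8) -> u8 {
    // Low four bits: '0'..='9' give 0..=9, 'a'..='f' and 'A'..='F' give 1..=6.
    let low = c & 0x0F;
    // Bit 6 is clear for ASCII digits (0x30..=0x39) and set for letters.
    let is_letter = c >> 6;
    // Letters need an extra 9 so that 'a' / 'A' map to 10.
    low + 9 * is_letter
}

fn main() {
    for &c in b"0123456789abcdefABCDEF" {
        println!("{} -> {}", c as char, unhex_nibble(c));
    }
    // An invalid character still produces a number, just not a meaningful one.
    println!("g -> {}", unhex_nibble(b'g'));
}
```

In this scalar form an invalid byte such as `g` simply yields a value outside 0..=15; what happens to such a value once it is shifted and merged into a pixel depends on the surrounding code.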

sbernauer

I'm happy to merge this PR as-is, but have made some changes to the Rust toolchain and README here.
I would leave it up to you whether you want to pull in the 3 commits from https://github.com/sbernauer/breakwater/tree/parser-improvements, or whether I should push them afterwards.

fabi321 · 2023-06-15T20:35:07Z

Regarding the assembly, this is the function as IDA disassembles it:

u32 __fastcall breakwater::parser::simd_unhex::h52042561b542debc(__u8_ value, __m128 _XMM0)
{
  u32 result; // eax
  core::option::Option<core::fmt::Arguments> args; // [rsp+0h] [rbp-38h] BYREF

  *(_QWORD *)args.gap0 = value.length;
  if ( value.length != 8 )
  {
    *(_QWORD *)&args.gap0[8] = 0LL;
    core::panicking::assert_failed::h676f2fb9f137bf56(Eq, (usize *)&args, (usize *)"\b", args);
  }
  __asm
  {
    vpmovzxbd ymm0, qword ptr [rdi]
    vpbroadcastd ymm2, cs:dword_2EE7B4
    vpsrld  ymm1, ymm0, 6
    vpmaddwd ymm1, ymm1, cs:ymmword_2EE7C0
    vpand   ymm0, ymm0, ymm2
    vpaddd  ymm0, ymm1, ymm0
    vpsllvd ymm0, ymm0, cs:ymmword_2EE7E0
    vextracti128 xmm1, ymm0, 1
    vpor    xmm0, xmm0, xmm1
    vpshufd xmm1, xmm0, 0EEh
    vpor    xmm0, xmm0, xmm1
    vpshufd xmm1, xmm0, 55h ; 'U'
    vpor    xmm0, xmm0, xmm1
    vmovd   eax, xmm0
    vzeroupper
  }
  return result;
}
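For readers who do not read AVX2 fluently, the listing above can be followed lane by lane in scalar code. This is a rough sketch only: the function name `unhex_u32`, the multiplier `9`, the mask `0x0F`, and especially the per-lane shift amounts (and therefore the byte order of the result) are assumptions, because the constants at `dword_2EE7B4`, `ymmword_2EE7C0`, and `ymmword_2EE7E0` are not shown in the thread.

```rust
/// Scalar walk-through of the eight-lane conversion.
/// Hypothetical: constants and nibble order are assumed, not taken from the PR.
fn unhex_u32(value: &[u8; 8]) -> u32 {
    let mut result = 0u32;
    for (lane, &byte) in value.iter().enumerate() {
        // vpmovzxbd: widen each input byte to its own 32-bit lane.
        let c = u32::from(byte);
        // vpand / vpsrld / vpmaddwd / vpaddd: low nibble plus a correction
        // that is only applied when bit 6 (letters) is set.
        let nibble = (c & 0x0F) + 9 * (c >> 6);
        // vpsllvd: shift each lane by a different amount.
        // ASSUMPTION: the first character ends up in the highest nibble.
        let shift = 4 * (7 - lane as u32);
        // vextracti128 / vpshufd / vpor: OR all lanes together.
        result |= nibble << shift;
    }
    result
}

fn main() {
    println!("{:08x}", unhex_u32(b"ff8000aa"));
}
```

The array type `[u8; 8]` takes the place of the length assertion at the top of the disassembled function, which panics when the slice is not exactly eight bytes long.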

fabi321 · 2023-06-15T20:41:58Z

@sbernauer I pulled your commits, and it is ready to be merged.

fabi321 · 2023-06-15T20:45:12Z

One note about parsing coordinates using SIMD: as they are variable length, it will be hard, if not impossible, to do. That was one of the reasons why I picked colors as my first target, as their length is known in the code path.

sbernauer · 2023-06-16T06:37:16Z

Those were exactly my thoughts! Maybe we can use some bitmasks to determine the coordinate length and afterwards use specialized SIMD instructions. Even cooler would be to parse both coordinates simultaneously, but we are getting ahead of ourselves ^^
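As a rough illustration of the bitmask idea, the sketch below marks which bytes of an 8-byte window are ASCII digits and derives the length of the first coordinate from that mask. It is written as scalar code; the names `digit_mask` and `leading_digit_count` and the 8-byte window size are assumptions for illustration, not code from this PR or from the simd-digit-parsing branch.

```rust
/// Builds a bitmask with bit i set when window[i] is an ASCII digit.
/// Hypothetical helper; a SIMD version would use a vector compare plus movemask.
fn digit_mask(window: &[u8; 8]) -> u8 {
    let mut mask = 0u8;
    for (i, &b) in window.iter().enumerate() {
        if b.is_ascii_digit() {
            mask |= 1 << i;
        }
    }
    mask
}

/// Number of consecutive digits at the start of the window,
/// i.e. the length of the first coordinate.
fn leading_digit_count(window: &[u8; 8]) -> u32 {
    digit_mask(window).trailing_ones()
}

fn main() {
    let window = b"123 456\n";
    println!("mask = {:#010b}", digit_mask(window));
    println!("first coordinate length = {}", leading_digit_count(window));
}
```

Once the length is known, a parser could branch to a routine specialized for that digit count, which is the part that would still have to prove itself in benchmarks.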

sbernauer · 2023-06-16T06:37:31Z

Thanks for putting this up!

sbernauer · 2023-06-16T06:53:00Z

I have to come back to this before merging to get the CI checks green:

Switch CI to use nightly Rust.
Switch the Docker build to not use CPU-specific features, as it should run everywhere.

Will try to get it done today

fabi321 · 2023-06-16T14:52:03Z

I have played around with SIMD digit parsing now, in a separate branch (https://github.com/fabi321/breakwater/tree/simd-digit-parsing). However, it is significantly slower as of now: it takes about 20.5ms. I also tried to confuse the branch predictor a bit by shuffling the commands, but that puts the current solution at 18.75ms, still faster than my SIMD one. So either I find a way to fix the performance issues, or it will have to stay like this.

sbernauer · 2023-06-16T17:56:10Z

Sounds really interesting, I think I will give it a try as well. I have now also installed Linux on my (~10 year old) desktop to get more reliable benchmarks.
I would merge this PR now and fix the CI checks in main (easier to test). We can improve things in a new PR, I would say.

sbernauer · 2023-06-16T18:00:55Z

Btw, I had a -8.2877% improvement (27.359ms vs 25.092ms) on my desktop 👍 (which sadly only supports AVX, not AVX2 or above)
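The percentages quoted in this thread can be sanity-checked with plain arithmetic on the reported means. A minimal sketch follows; the function name `relative_change_percent` is made up for illustration. The benchmark tool presumably derives its figure from its own estimates, so the result of this simple calculation can differ in the last digits.

```rust
/// Relative change from `before_ms` to `after_ms`, in percent.
/// Negative values mean the run got faster. Hypothetical helper.
fn relative_change_percent(before_ms: f64, after_ms: f64) -> f64 {
    (after_ms - before_ms) / before_ms * 100.0
}

fn main() {
    // Numbers reported in this thread.
    println!("{:.2}%", relative_change_percent(13.5, 11.0));
    println!("{:.2}%", relative_change_percent(27.359, 25.092));
}
```

With these inputs, going from 13.5ms to 11ms works out to about -18.5%, slightly less than the rounded "~-20%" in the commit message, and 27.359ms to 25.092ms works out to about -8.29%.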

fabi321 added 5 commits June 15, 2023 16:28

remove faster-hex as I didn't end up using it

d3da855

fixed alpha parsing

e99b81e

More readable command parsing

737a4b1

move OFFSET as it is likely used more often than HELP and SIZE

132b3b5

Throwing some SIMD at it

19326ed

reduces the time from 13.5ms to 11ms (~-20%), and breaks the benchmark

fabi321 and others added 5 commits June 15, 2023 21:33

Remove old lookup method

1c13295

added remark about undefined behavior

6e6d9f8

Set toolchain to nightly

ad9821f

docs: Change link to permalink

173ae60

docs: Document SIMD and nightly rust in README

ae5b03f

sbernauer approved these changes Jun 15, 2023


sbernauer merged commit 61a1654 into sbernauer:master Jun 16, 2023

fabi321 deleted the parser-improvements branch June 16, 2023 21:29

fabi321 restored the parser-improvements branch June 16, 2023 21:41

fabi321 deleted the parser-improvements branch June 20, 2023 21:34
