Threaded rayon #246

Open
wants to merge 2 commits into
base: master

Commits on May 14, 2022

  1. Run aarch64 tests with --release

    These tests are emulated, so the instruction count adds significantly to
    the runtime. As it stands, a full run takes >20 minutes, which is
    the bottleneck of our CI times. It should not be necessary to run these
    with debug flags. The main reason to have them at all, the arch/aarch64
    inline SIMD code, is not affected by flags in any case.
    HeroicKatora committed May 14, 2022
    fd80c1d
  2. Change parallelization strategy in rayon

    Intends to address the issue of effectively serialized sort, where all
    tasks end up being executed on the main thread instead of being
    distributed into other workers.
    
    We had overlooked that most work is scheduled synchronously (append_row,
    such as in decoder.rs:903, instead of append_rows). This meant most
    tasks were executed with an immediate strategy.
    
    The change pushes all items into a bounded task queue that is emptied
    and actively worked on when it reaches its maximum capacity, as well as
    when any component result is requested. This is in contrast to
    std::multithreading, where items are worked on while decoding is in
    progress but task queueing itself has more overhead.
    
    decode a 512x512 JPEG   time:   [1.7317 ms 1.7352 ms 1.7388 ms]
                            change: [-22.895% -22.646% -22.351%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 6 outliers among 100 measurements (6.00%)
      1 (1.00%) low mild
      4 (4.00%) high mild
      1 (1.00%) high severe
    
    decode a 512x512 progressive JPEG
                            time:   [4.7252 ms 4.7364 ms 4.7491 ms]
                            change: [-15.641% -15.349% -15.052%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 3 outliers among 100 measurements (3.00%)
      1 (1.00%) high mild
      2 (2.00%) high severe
    
    decode a 512x512 grayscale JPEG
                            time:   [873.48 us 877.71 us 882.83 us]
                            change: [-11.470% -10.764% -10.041%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 13 outliers among 100 measurements (13.00%)
      2 (2.00%) low mild
      9 (9.00%) high mild
      2 (2.00%) high severe
    
    extract metadata from an image
                            time:   [1.1033 us 1.1066 us 1.1099 us]
                            change: [-11.608% -9.8026% -8.3965%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 4 outliers among 100 measurements (4.00%)
      2 (2.00%) low severe
      1 (1.00%) low mild
      1 (1.00%) high mild
    
    Benchmarking decode a 3072x2048 RGB Lossless JPEG: Warming up for 3.0000 s
    Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 36.6s, or reduce sample count to 10.
    decode a 3072x2048 RGB Lossless JPEG
                            time:   [363.07 ms 363.66 ms 364.27 ms]
                            change: [+0.0997% +0.3692% +0.6323%] (p = 0.01 < 0.05)
                            Change within noise threshold.
    Found 6 outliers among 100 measurements (6.00%)
      5 (5.00%) high mild
      1 (1.00%) high severe
    
         Running unittests (target/release/deps/large_image-0e61f2c2f07410bd)
    Gnuplot not found, using plotters backend
    decode a 2268x1512 JPEG time:   [28.755 ms 28.879 ms 29.021 ms]
                            change: [-5.7714% -4.9308% -4.0969%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 7 outliers among 100 measurements (7.00%)
      3 (3.00%) high mild
      4 (4.00%) high severe
    HeroicKatora committed May 14, 2022
    c8323b7
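
The queueing strategy described in the second commit message can be illustrated with a minimal sketch. It is not the code from this PR: the PR uses rayon, while the sketch drains the queue with `std::thread::scope` so that it needs no dependencies. The names `RowTask`, `BatchedWorker`, `get_result`, `process` and `demo` are hypothetical, and the per-row work is a placeholder sum. It only shows the two flush triggers: the queue reaching its capacity, and a component result being requested.

```rust
use std::thread;

/// Hypothetical stand-in for one row of decoding work.
struct RowTask {
    component: usize,
    row: usize,
    data: Vec<i16>,
}

/// Hypothetical worker that defers rows into a bounded queue.
struct BatchedWorker {
    capacity: usize,
    queue: Vec<RowTask>,
    /// Per component: (row index, processed value).
    results: Vec<Vec<(usize, i32)>>,
    /// Number of times the queue was drained (for illustration only).
    flushes: usize,
}

/// Placeholder for the real per-row work (assumption: any pure function).
fn process(data: &[i16]) -> i32 {
    data.iter().map(|&v| i32::from(v)).sum()
}

impl BatchedWorker {
    fn new(components: usize, capacity: usize) -> Self {
        Self {
            capacity,
            queue: Vec::new(),
            results: vec![Vec::new(); components],
            flushes: 0,
        }
    }

    /// Queue a single row; drain the queue once it reaches capacity.
    fn append_row(&mut self, task: RowTask) {
        self.queue.push(task);
        if self.queue.len() >= self.capacity {
            self.flush();
        }
    }

    /// Requesting a result forces any pending rows to be processed first.
    fn get_result(&mut self, component: usize) -> Vec<(usize, i32)> {
        self.flush();
        let mut rows = std::mem::take(&mut self.results[component]);
        rows.sort_by_key(|&(row, _)| row);
        rows
    }

    /// Process every queued row in parallel, then store the results.
    fn flush(&mut self) {
        if self.queue.is_empty() {
            return;
        }
        self.flushes += 1;
        let tasks = std::mem::take(&mut self.queue);
        // One scoped thread per task keeps the sketch short; a thread pool
        // such as rayon's would be used in practice.
        let done: Vec<(usize, usize, i32)> = thread::scope(|s| {
            let handles: Vec<_> = tasks
                .iter()
                .map(|t| s.spawn(move || (t.component, t.row, process(&t.data))))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        });
        for (component, row, value) in done {
            self.results[component].push((row, value));
        }
    }
}

/// Usage example: 5 rows for each of 2 components, queue capacity 4.
fn demo() -> (Vec<(usize, i32)>, Vec<(usize, i32)>, usize) {
    let mut worker = BatchedWorker::new(2, 4);
    for row in 0..5 {
        for component in 0..2 {
            let data = vec![row as i16, component as i16];
            worker.append_row(RowTask { component, row, data });
        }
    }
    let c0 = worker.get_result(0);
    let c1 = worker.get_result(1);
    (c0, c1, worker.flushes)
}
```

In `demo`, the ten queued rows trigger two capacity flushes, and the first `get_result` call drains the remaining two rows. The trade-off named in the commit message is visible here: nothing runs until a flush, but each individual `append_row` call stays cheap.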