Benchmark for addressing tiny elementwise dispatch issues #10216

hanhanW · 2022-08-25T22:40:57Z

No description provided.

iree-github-actions-bot · 2022-08-25T23:31:02Z

Abbreviated Benchmark Summary

@ commit 3956a1ebdef8291accd5dcce93e8ee31a92206d2 (vs. base acb7355b1e80495fc63dd8d2358ca5b821f5eb8a)

Regressed Latencies 🚩

Benchmark Name	Average Latency (ms)	Median Latency (ms)	Latency Standard Deviation (ms)
MobileBertSquad [fp16] (TFLite) full-inference,experimental-flags with IREE-Vulkan @ Pixel-6-Pro (GPU-Mali-G78)	154.318 (vs. 130.492, 18.26%↑)	154.731	3.240
MobileNetV3Small [fp32,imagenet] (TFLite) 4-thread,big-core,full-inference,experimental-flags with IREE-LLVM-CPU @ Pixel-4 (CPU-ARMv8.2-A)	10.870 (vs. 9.638, 12.78%↑)	10.873	0.028
MobileBertSquad [fp16] (TFLite) full-inference,default-flags with IREE-Vulkan @ Pixel-6-Pro (GPU-Mali-G78)	139.878 (vs. 130.053, 7.55%↑)	138.348	5.972

[Top 3 out of 6 results shown]
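The percentages in these rows are consistent with a plain relative change against the base commit, (current - base) / base. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code; the function names `relative_change_percent` and `format_change` are hypothetical and chosen only for illustration.

```python
def relative_change_percent(current: float, base: float) -> float:
    """Signed percentage change of `current` relative to `base`.

    Assumption: the summary reports (current - base) / base, which matches
    the figures printed in the tables.
    """
    if base == 0:
        raise ValueError("base must be non-zero")
    return (current - base) / base * 100.0


def format_change(current: float, base: float) -> str:
    """Render a latency cell in the same shape as the summary rows."""
    pct = relative_change_percent(current, base)
    # An upward arrow marks an increase over the base; downward marks a decrease.
    arrow = "↑" if pct > 0 else "↓"
    return f"{current:.3f} (vs. {base:.3f}, {abs(pct):.2f}%{arrow})"


if __name__ == "__main__":
    # MobileBertSquad [fp16] on Mali-G78, experimental flags (regression).
    print(format_change(154.318, 130.492))
    # MobileBertSquad [int8] on Pixel-4 CPU, experimental flags (improvement).
    print(format_change(199.307, 236.307))
```

For latency, compilation time, and dispatch size alike, a positive change (↑) is a regression and a negative change (↓) is an improvement.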

Improved Latencies 🎉

Benchmark Name	Average Latency (ms)	Median Latency (ms)	Latency Standard Deviation (ms)
MobileBertSquad [int8] (TFLite) 4-thread,big-core,full-inference,experimental-flags with IREE-LLVM-CPU @ Pixel-4 (CPU-ARMv8.2-A)	199.307 (vs. 236.307, 15.66%↓)	199.285	0.406
MobileBertSquad [fp32] (TFLite) 4-thread,big-core,full-inference,experimental-flags with IREE-LLVM-CPU @ Pixel-6-Pro (CPU-ARMv8.2-A)	312.317 (vs. 370.020, 15.59%↓)	297.569	37.892
MobileNetV2 [fp32,imagenet] (TFLite) big-core,full-inference,default-flags with IREE-LLVM-CPU-Sync @ Pixel-4 (CPU-ARMv8.2-A)	38.029 (vs. 42.592, 10.71%↓)	35.570	3.277

[Top 3 out of 9 results shown]

No improved or regressed compilation metrics 🏖️


iree-github-actions-bot · 2022-08-26T02:23:43Z

Abbreviated Linux Benchmark Summary

@ commit 3956a1ebdef8291accd5dcce93e8ee31a92206d2 (vs. base 094ec6d183e769893ae6e9ee98e668712a19dd24)

Improved Latencies 🎉

Benchmark Name	Average Latency (ms)	Median Latency (ms)	Latency Standard Deviation (ms)
PoseNet [fp32] (TFLite) 8-thread,full-inference,default-flags with IREE-LLVM-CPU @ GCP-c2-standard-16 (CPU-x86_64-CascadeLake)	6.268 (vs. 8.382, 25.22%↓)	6.277	0.034
PoseNet [fp32] (TFLite) 4-thread,full-inference,default-flags with IREE-LLVM-CPU @ GCP-c2-standard-16 (CPU-x86_64-CascadeLake)	7.976 (vs. 10.196, 21.78%↓)	7.970	0.038
PersonDetect [int8] (TFLite) 8-thread,full-inference,default-flags with IREE-LLVM-CPU @ GCP-c2-standard-16 (CPU-x86_64-CascadeLake)	1.647 (vs. 1.972, 16.47%↓)	1.647	0.004

[Top 3 out of 7 results shown]

Regressed Compilation Times 🚩

Benchmark Name	Compilation Time (ms)
MobileBertSquad [int8] (TFLite) CPU-RV32-Generic full-inference,default-flags	10999718 (vs. 1369401, 703.25%↑)
MobileBertSquad [int8] (TFLite) CPU-RV64-Generic full-inference,default-flags	395836 (vs. 146289, 170.58%↑)
EfficientNet [int8] (TFLite) CPU-RV32-Generic full-inference,default-flags	17652 (vs. 15420, 14.47%↑)

[Top 3 out of 14 results shown]

Improved Compilation Times 🎉

Benchmark Name	Compilation Time (ms)
PoseNet [fp32] (TFLite) CPU-x86_64-CascadeLake 8-thread,full-inference,default-flags	8767 (vs. 10332, 15.15%↓)
PoseNet [fp32] (TFLite) CPU-x86_64-CascadeLake full-inference,default-flags	8767 (vs. 10332, 15.15%↓)
PoseNet [fp32] (TFLite) CPU-x86_64-CascadeLake 4-thread,full-inference,default-flags	8767 (vs. 10332, 15.15%↓)

[Top 3 out of 12 results shown]

Regressed Total Dispatch Sizes 🚩

Benchmark Name	Total Dispatch Size (bytes)
MobileBertSquad [int8] (TFLite) CPU-RV32-Generic full-inference,default-flags	84601208 (vs. 22515640, 275.74%↑)
MobileBertSquad [int8] (TFLite) CPU-RV64-Generic full-inference,default-flags	11305392 (vs. 4133184, 173.53%↑)
EfficientNet [int8] (TFLite) CPU-RV32-Generic full-inference,default-flags	749980 (vs. 610652, 22.82%↑)

[Top 3 out of 9 results shown]


Benchmark for addressing tiny elementwise dispatch issues

3956a1e

hanhanW added the (deprecated) buildkite:benchmark-android (Deprecated. Please use benchmarks:android-*) and buildkite:benchmark-x86_64 labels Aug 25, 2022

hanhanW mentioned this pull request Aug 26, 2022

Optimize tiling sizes heuristics for elementwise dispatches. #10179

Merged

hanhanW closed this Aug 29, 2022

hanhanW deleted the tiny-elem-dis branch August 29, 2022 20:16
