Pad fusion bufferization workaround. #12425

@MaheshRavishankar MaheshRavishankar commented Feb 28, 2023

Bufferization appears to need more work before it can handle the code generated by tiling pad operations. To unblock native handling of pad operations in IREE, this PR implements the approach from #11273 (comment) as a workaround.

To bufferize without allocation, the yields of the `then` and `else` branches and the result of the `scf.if` are all tied together. If the `then` and `else` values come from different bindings, tying them is illegal because a copy is needed. This case led to additional constraints on which sets can be merged during `BufferizationAnalysis`: sets that contain constants, or that would end up with two different `interface_bindings`, are not merged.
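The merge constraints described above can be sketched with a small union-find. This is an illustrative model only, not the IREE implementation: the class `EquivalenceSets`, its methods, and the value names are hypothetical, and it assumes one reading of the description, namely that a set containing a constant is never merged and that a merge is refused when the combined set would contain two different bindings.

```python
class EquivalenceSets:
    """Toy union-find over value names (hypothetical, not IREE's analysis)."""

    def __init__(self):
        self._parent = {}
        self._bindings = {}      # root -> set of binding ids seen in that set
        self._has_constant = {}  # root -> True if the set contains a constant

    def add(self, value, binding=None, is_constant=False):
        # Every value starts in its own singleton set.
        self._parent[value] = value
        self._bindings[value] = set() if binding is None else {binding}
        self._has_constant[value] = is_constant

    def find(self, value):
        # Root lookup with path halving.
        while self._parent[value] != value:
            self._parent[value] = self._parent[self._parent[value]]
            value = self._parent[value]
        return value

    def tie(self, *values):
        """Merge the sets of all values, or change nothing and return False."""
        roots = list(dict.fromkeys(self.find(v) for v in values))
        if len(roots) <= 1:
            return True
        # Assumed constraint 1: sets that contain constants are not merged.
        if any(self._has_constant[r] for r in roots):
            return False
        # Assumed constraint 2: the merged set may hold at most one binding.
        merged = set().union(*(self._bindings[r] for r in roots))
        if len(merged) > 1:
            return False
        keep = roots[0]
        for r in roots[1:]:
            self._parent[r] = keep
            del self._bindings[r]
            del self._has_constant[r]
        self._bindings[keep] = merged
        return True


# The two branches of an scf.if yield values backed by different bindings,
# so tying both yields to the result is refused (a copy would be needed).
sets = EquivalenceSets()
sets.add("then_yield", binding=0)
sets.add("else_yield", binding=1)
sets.add("if_result")
tied = sets.tie("then_yield", "else_yield", "if_result")
```

Checking all constraints before mutating keeps `tie` all-or-nothing, so a refused merge never leaves the two yields and the result partially tied.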

benchmarks: x86_64, cuda

@MaheshRavishankar MaheshRavishankar added the (deprecated) buildkite:benchmark-android Deprecated. Please use benchmarks:android-* label Feb 28, 2023
@MaheshRavishankar MaheshRavishankar force-pushed the pad_fusion_bufferization_workaround branch from a866eb7 to 58fbc23 Compare February 28, 2023 21:02
iree-github-actions-bot commented Mar 1, 2023

Abbreviated Android Benchmark Summary

@ commit 90d6f471f6c891772a465f87f026892f153ffa6d (vs. base 17eafc9a230e31744224026d77cf46982e5c0a17)

Regressed Latencies 🚩

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small [fp32,imagenet] (TFLite) little-core,full-inference,experimental-flags with IREE-LLVM-CPU-Sync @ Pixel-6-Pro (CPU-ARMv8.2-A) | 85.105 (vs. 70.585, 20.57%↑) | 85.226 | 0.396 |
| PersonDetect [int8] (TFLite) full-inference,default-flags with IREE-Vulkan @ Pixel-6-Pro (GPU-Mali-G78) | 3.330 (vs. 2.917, 14.14%↑) | 3.335 | 0.038 |
| MobileBertSquad [fp32] (TFLite) 4-thread,little-core,full-inference,default-flags with IREE-LLVM-CPU @ Pixel-6-Pro (CPU-ARMv8.2-A) | 1287.937 (vs. 1191.772, 8.07%↑) | 1288.481 | 9.206 |

[Top 3 out of 6 results shown]

Improved Latencies 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2 [fp32,imagenet] (TFLite) little-core,full-inference,default-flags with IREE-LLVM-CPU-Sync @ Pixel-6-Pro (CPU-ARMv8.2-A) | 176.635 (vs. 195.015, 9.42%↓) | 176.615 | 0.152 |


@hanhanW hanhanW left a comment

LG, just a nit!

@MaheshRavishankar MaheshRavishankar enabled auto-merge (squash) March 2, 2023 20:49
@MaheshRavishankar MaheshRavishankar merged commit f43f8a3 into iree-org:main Mar 2, 2023
qedawkins pushed a commit to qedawkins/iree that referenced this pull request Apr 2, 2023
@jpienaar jpienaar mentioned this pull request Apr 3, 2023
jpienaar pushed a commit that referenced this pull request May 1, 2023
rengolin pushed a commit to plaidml/iree that referenced this pull request May 2, 2023
NatashaKnk pushed a commit to NatashaKnk/iree that referenced this pull request Jul 6, 2023
3 participants