[Texture support][Part 4] Add CollectStorageInfo and CollectBufferBinds relay passes for Adreno GPU #7689

Closed
wants to merge 19 commits into from

@csullivan (Contributor) commented Mar 18, 2021

This PR adds a target specific implementation of CollectStorageInfo and CollectBufferBinds for the opencl --device=adreno target.

CollectStorageInfo

  • Returns a storage mapping for the output of each Expr in a Relay function. For Adreno, the output storage scope is set to "texture" for primitive functions that contain a Conv2d whose data and kernel layouts are NCHW4c and OIHW4o, respectively.
  • The output storage scope for Constants and Vars is inferred from the output storage scope of their consumers.
  • A legalization step then sets the output of any primitive function to global scope if one or more of its consumers requires global scope for its output. Because this step runs after the storage scopes have been collected, it is not transitive: the global scope does not propagate toward the inputs.
  • An additional legalization step ensures that the global outputs of the Relay function are marked as global scope. This step can be removed once CopyFromTo supports passing shape information for the source and destination tensors, or if texture memory reuse is disabled.
    • The key issue is that the texture space allocated for an output tensor may be a pool, in which case the region and row pitch are required to correctly read the texture from the pool to the host. Both can be inferred from the shapes of the src and dst tensors if they are provided to CopyFromTo.
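
The scope collection and the two legalization steps described above can be sketched on a toy graph. This is an illustration only, not the pass implemented in this PR: the `Node` class, the `texture_conv2d` flag, and the function name are hypothetical, and the rule used for Vars and Constants (texture only when every consumer was collected as texture) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a Relay expression; not a TVM type.
    name: str
    kind: str  # "var", "const", or "primitive"
    inputs: list = field(default_factory=list)
    # Hypothetical flag: primitive contains a conv2d with NCHW4c data
    # and OIHW4o kernel layouts.
    texture_conv2d: bool = False


def collect_storage_scopes(nodes, outputs):
    # Build the producer -> consumers map.
    consumers = {n.name: [] for n in nodes}
    for n in nodes:
        for producer in n.inputs:
            consumers[producer].append(n.name)

    # Step 1: primitive functions get "texture" only for the supported conv2d.
    collected = {
        n.name: "texture" if n.texture_conv2d else "global"
        for n in nodes
        if n.kind == "primitive"
    }

    # Step 2: Vars and Constants follow their consumers (assumed rule:
    # texture only if all consumers were collected as texture).
    for n in nodes:
        if n.kind in ("var", "const"):
            users = consumers[n.name]
            all_texture = bool(users) and all(collected[u] == "texture" for u in users)
            collected[n.name] = "texture" if all_texture else "global"

    # Step 3: legalize primitives against the *collected* snapshot, so a
    # scope changed here does not propagate further toward the inputs.
    legalized = dict(collected)
    for n in nodes:
        if n.kind == "primitive" and any(
            collected[u] == "global" for u in consumers[n.name]
        ):
            legalized[n.name] = "global"

    # Step 4: outputs of the whole function are always global.
    for name in outputs:
        legalized[name] = "global"
    return legalized


graph = [
    Node("data", "var"),
    Node("w1", "const"),
    Node("conv1", "primitive", ["data", "w1"], texture_conv2d=True),
    Node("w2", "const"),
    Node("conv2", "primitive", ["conv1", "w2"], texture_conv2d=True),
    Node("other", "primitive", ["conv2"]),
]
scopes = collect_storage_scopes(graph, outputs=["other"])
```

In this toy graph `conv2` is demoted to global because its consumer `other` is global, while `conv1` keeps its texture scope: the legalization reads the collected snapshot, so the demotion of `conv2` does not travel one step further back.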

CollectBufferBinds

  • Translates the storage scope information provided by the memory planner (and ultimately by the CollectStorageInfo pass above) into tir::Buffers that can be bound to te::Tensors when lowering Relay primitive functions. These buffers are provided to CompileEngine and used in tvm::lower/build via the binds field. If the provided buffers have texture scope, the TIR TextureFlattening pass handles lowering the multidimensional accesses to two dimensions.
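
To make "lowering the multidimensional accesses to two dimensions" concrete, here is a minimal sketch of one possible mapping. It assumes the innermost axis (the `4c` block) becomes the channel of a single texel and that the remaining axes are split row-major into texture rows and columns at a caller-chosen position. The function name and the `row_axes` parameter are hypothetical, and the split actually used by the TextureFlattening pass is not stated in this description.

```python
from math import prod


def flatten_to_2d(shape, index, row_axes):
    """Map an N-d index to (row, col, channel) in a 2-d texture.

    The last axis is treated as the texel channel. The first `row_axes`
    of the remaining axes are folded into the row coordinate and the rest
    into the column coordinate, both in row-major order.
    """
    *outer_shape, channels = shape
    *outer_index, channel = index

    row = 0
    for extent, i in zip(outer_shape[:row_axes], outer_index[:row_axes]):
        row = row * extent + i

    col = 0
    for extent, i in zip(outer_shape[row_axes:], outer_index[row_axes:]):
        col = col * extent + i

    height = prod(outer_shape[:row_axes])
    width = prod(outer_shape[row_axes:])
    return (height, width, channels), (row, col, channel)


# NCHW4c tensor of shape (1, 8, 16, 16, 4); fold N, C, H into rows.
texture_shape, texel = flatten_to_2d((1, 8, 16, 16, 4), (0, 3, 5, 7, 2), row_axes=3)
# texture_shape == (128, 16, 4), texel == (53, 7, 2)
```

A different choice of `row_axes` trades texture height against width, which matters in practice because image dimensions are bounded by the device.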

See RFC here: https://discuss.tvm.apache.org/t/rfc-texture-memory-support/9467

only produces texture if all consumers support
reading from texture.
primitives containing conv2d in NCHW4c/OIHW4o layout.
in adreno storage scope annotation (CollectStorageInfo).
Ensure that the output is marked as global if a single
consumer requires global input. The previous legalization
only ensured this if multiple consumers had differing
storage requirements.
…provided

texture buffers. Need to propagate this to allocated textures when
cache_read(texture) is used for weights.
Note: for now, the Adreno implementation assumes the output storage
types are uniform for multi-output nodes.
outputs to be global storage to avoid need for image
row pitch calc.
 - Max/Avg/Global Pooling
 - Concatenate
 - LayoutTransform (NCHW -> NCHW4c)
@jroesch added the needs-triage and status: need review labels Jan 19, 2022
@csullivan
Contributor Author

Subsumed by #11878.

@csullivan csullivan closed this Oct 14, 2022