[BYOC] Switch TensorRT BYOC integration to IRModule-at-a-time using RelayToTIR hook #11979

Merged
masahi merged 10 commits into apache:main from mbs-collage-trt-refresh on Jul 1, 2022

Contributor

@mbs-octoml mbs-octoml commented Jun 30, 2022

This does for the TensorRT integration what #11631 did for the CUTLASS integration. See https://discuss.tvm.apache.org/t/byoc-how-backwards-compatible-does-the-tensorrt-partition-for-tensorrt-function-need-to-be/12957 for discussion.

  • All compilation options are captured within the attributes of a Target of kind "tensorrt" (instead of the "relay.ext.tensorrt.options" attribute in PassContext). This means all BYOC configuration options needed by Collage can be captured uniformly by a list-of-Targets. It also means RPC boundaries (as used internally at OctoML) only need to worry about maintaining the fidelity of the Target instance(s) rather than reaching into the PassContext.

    The API is now:

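      import tvm
      from tvm import relay
      from tvm.relay.op.contrib import tensorrt

      # mod, params: a Relay IRModule and its parameter dict, obtained elsewhere
      host_target = tvm.target.Target("llvm")
      cuda_target = tvm.target.Target("cuda", host=host_target)
      trt_target = tvm.target.Target("tensorrt -use_fp16=True", host=host_target)
      mod = tensorrt.partition_for_tensorrt(mod, params=params, target=trt_target)
      runtime_mod = relay.build(mod, target=[cuda_target, trt_target])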
    
  • Compilation is switched from function-at-a-time (relying on the TECompiler) to IRModule-at-a-time (using the RelayToTIR target-specific hook mechanism). Though not strictly necessary for Collage, I want to check the path is now clear to deprecate the support for BYOC in TECompiler.

  • Get all the TensorRT tests going again, except for a few I've disabled with a cross-link to a new issue, [BYOC] TensorRT testing issues #11765. CAUTION: The TensorRT runtime is not supported in CI, so many of these tests are just cosmetic.

  • While trying to track down a 'free(): invalid pointer' error in test_tensorrt_int8_exp.py I made the TensorRT allocs/frees more robust, but it turns out the error comes from an unrelated issue, [Bug] PyTorch and TVM loading problem due to conflicting LLVM symbols #9362 (thanks Masa). No harm leaving these changes in though.
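
The move from function-at-a-time to IRModule-at-a-time compilation, with options carried on a Target, can be pictured with a toy model. The sketch below is plain Python and not TVM code: `ToyTarget`, `Func`, `compile_function_at_a_time`, `tensorrt_module_hook` and `toy_build` are hypothetical names invented for illustration, and the real RelayToTIR hook is attached to the target kind inside TVM rather than called directly like this.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ToyTarget:
    """Toy stand-in for a Target: a kind plus its option attributes."""
    kind: str
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass(frozen=True)
class Func:
    """Toy stand-in for a Relay function; `compiler` mimics the "Compiler" attribute."""
    body: str
    compiler: str = ""  # empty: not offloaded to an external codegen


Module = Dict[str, Func]  # toy "IRModule": global name -> function


def compile_function_at_a_time(mod: Module, codegen: Callable[[str, Func], str]) -> Dict[str, str]:
    """Old shape: the driver invokes the external codegen once per offloaded function."""
    return {name: codegen(name, f) for name, f in mod.items() if f.compiler == "tensorrt"}


def tensorrt_module_hook(mod: Module, target: ToyTarget) -> Module:
    """New shape: one pass sees the whole module and reads its options from the target."""
    fp16 = bool(target.attrs.get("use_fp16", False))
    return {
        name: Func(f"extern_trt({f.body}, fp16={fp16})") if f.compiler == "tensorrt" else f
        for name, f in mod.items()
    }


def toy_build(mod: Module, targets: List[ToyTarget]) -> Module:
    """Run the module-level hook for each "tensorrt" target in the list of targets."""
    for target in targets:
        if target.kind == "tensorrt":
            mod = tensorrt_module_hook(mod, target)
    return mod


# Hypothetical two-function module: "main" calls one offloaded function.
mod = {"main": Func("call @trt_0"), "trt_0": Func("conv2d", compiler="tensorrt")}
lowered = toy_build(mod, [ToyTarget("cuda"), ToyTarget("tensorrt", {"use_fp16": True})])
print(lowered["trt_0"].body)
```

In this toy, everything the "tensorrt" step needs travels in the list of targets, so nothing has to be read from a separate context object, which is the property the first bullet relies on for Collage and for RPC boundaries.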

@areusch
Contributor

areusch commented Jun 30, 2022

@masahi @mbaret interested in having a look?

@masahi
Member

masahi commented Jul 1, 2022

While trying to track down a 'free(): invalid pointer' error in test_tensorrt_int8_exp.py

I'm pretty sure this is the LLVM symbol conflict issue, #9362. A simple workaround is to always import torch before tvm, as in #10342.
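
One way a test suite could guard the import-order workaround is sketched below. This is an assumption-laden illustration, not something from this PR: `imported_before` is a hypothetical helper, and it only inspects the order of entries in `sys.modules` (a dict, so it keeps insertion order), which is a proxy for which shared libraries were loaded first.

```python
import sys
from typing import Mapping, Optional


def imported_before(first: str, second: str,
                    modules: Optional[Mapping[str, object]] = None) -> bool:
    """Return True if `first` was imported before `second`, or `second` is not loaded yet."""
    names = list(sys.modules if modules is None else modules)
    if second not in names:
        # `second` has not been imported, so importing `first` now is still in time.
        return True
    return first in names and names.index(first) < names.index(second)


# Example check against the live import table of the current process.
if not imported_before("torch", "tvm"):
    print("warning: tvm was imported before torch; see #9362")
```

The optional `modules` argument exists only so the check can be exercised with a fake import table in a unit test.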

@mbs-octoml
Contributor Author

@masahi thanks, you are right.

…elayToTIR hook

This does for the TensorRT integration what apache#11631 did for the CUTLASS integration.

- All compilation options are captured within the attributes of a Target of
  kind "tensorrt" (instead of the "relay.ext.tensorrt.options" attribute in
  PassContext). This means all BYOC configuration options needed by Collage can
  be captured uniformly by a list-of-Targets. It also means RPC boundaries (as used
  internally at OctoML) only need to worry about maintaining the fidelity of the
  Target instance(s) rather than reaching into the PassContext.

- Compilation is switched from function-at-a-time (relying on the TECompiler) to
  IRModule-at-a-time (using the RelayToTIR target-specific hook mechanism). Though
  not strictly necessary for Collage I want to check the path is now clear to
  deprecate the support for BYOC in TECompiler.

- Get all the TensorRT tests going again, except for a few I've disabled with
  x-link to a new issue apache#11765. CAUTION: The TensorRT runtime is not supported in
  CI so many of these tests are cosmetic.

- While trying to track down a 'free(): invalid pointer' error in test_tensorrt_int8_exp.py
  I made the TensorRT allocs/frees more robust, but it turns out it's also broken in main.
  No harm leaving these changes in though.
- can't use default Target("tensorrt") arg
- handle missing runtime in versioning
- turn test_maskrcnn_resnet50 back on now that we have the
  import-torch-first workaround.
@mbs-octoml
Contributor Author

Pretty sure I've got all the CI failures and this is ready for review.

@mbs-octoml mbs-octoml mentioned this pull request Jul 1, 2022
@mbs-octoml
Contributor Author

Thanks @masahi! This can be merged.

@masahi masahi merged commit d2a14a6 into apache:main Jul 1, 2022
@mbs-octoml mbs-octoml deleted the mbs-collage-trt-refresh branch July 1, 2022 22:56
blackkker pushed a commit to blackkker/tvm that referenced this pull request Jul 7, 2022
…elayToTIR hook (apache#11979)

* [BYOC] Switch TensorRT BYOC integration to IRModule-at-a-time using RelayToTIR hook

* - Lints

* - Woops, fix test

* - lints

* - Use default tensorrt target if none given in targets list

* - fix free error

* - accidentally introduced 'transforms' namespace
- can't use default Target("tensorrt") arg

* - D'oh! Include ended up #if protected

* - restore mark for test_dynamic_offload
- handle missing runtime in versioning
- turn test_maskrcnn_resnet50 back on now that we have the
  import-torch-first workaround.

* - wibble
masahi pushed a commit to masahi/tvm that referenced this pull request Jul 15, 2022
…elayToTIR hook (apache#11979)

mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023
…elayToTIR hook (apache#11979)

3 participants