[BYOC] Switch TensorRT BYOC integration to IRModule-at-a-time using RelayToTIR hook #11979

mbs-octoml · 2022-06-30T22:17:41Z

This does for the TensorRT integration what #11631 did for the CUTLASS integration. See https://discuss.tvm.apache.org/t/byoc-how-backwards-compatible-does-the-tensorrt-partition-for-tensorrt-function-need-to-be/12957 for discussion.

All compilation options are captured within the attributes of a Target of kind "tensorrt" (instead of the "relay.ext.tensorrt.options" attribute in PassContext). This means all BYOC configuration options needed by Collage can be captured uniformly by a list of Targets. It also means RPC boundaries (as used internally at OctoML) only need to worry about maintaining the fidelity of the Target instance(s) rather than reaching into the PassContext.

The API is now:
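```
import tvm
from tvm.relay import build  # assumed: the PR text only says build(...)
from tvm.relay.op.contrib import tensorrt

# mod and params are an existing Relay IRModule and its parameter dict.
host_target = tvm.target.Target("llvm")
cuda_target = tvm.target.Target("cuda", host=host_target)
trt_target = tvm.target.Target("tensorrt -use_fp16=True", host=host_target)
mod = tensorrt.partition_for_tensorrt(mod, params=params, target=trt_target)
runtime_mod = build(mod, target=[cuda_target, trt_target])
```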
Compilation is switched from function-at-a-time (relying on the TECompiler) to IRModule-at-a-time (using the RelayToTIR target-specific hook mechanism). Though not strictly necessary for Collage, I want to check the path is now clear to deprecate the support for BYOC in TECompiler.
All the TensorRT tests are running again, except for a few I've disabled with a cross-link to a new issue, [BYOC] TensorRT testing issues #11765. CAUTION: The TensorRT runtime is not supported in CI, so many of these tests are just cosmetic.
While trying to track down a 'free(): invalid pointer' error in test_tensorrt_int8_exp.py I made the TensorRT allocs/frees more robust, but it turns out the error comes from an unrelated issue, [Bug] PyTorch and TVM loading problem due to conflicting LLVM symbols #9362 (thanks Masa). No harm in leaving these changes in, though.
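To make the function-at-a-time versus IRModule-at-a-time distinction concrete, here is a minimal plain-Python sketch. It does not use the TVM API: `ToyTarget`, `toy_relay_to_tir_hook`, the dict-based "module", and the function name `trt_subgraph_0` are all hypothetical stand-ins chosen for illustration. The only point it tries to show is that a target-specific hook sees the whole module at once and reads its options from the target's attributes rather than from a global context.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTarget:
    """Hypothetical stand-in for a Target: a kind plus its option attributes."""
    kind: str
    attrs: dict = field(default_factory=dict)


def toy_relay_to_tir_hook(module, target):
    """Visit the whole module once and 'compile' every function whose
    "Compiler" attribute matches the target kind. Other functions pass through."""
    out = {}
    for name, func in module.items():
        if func.get("Compiler") == target.kind:
            # Options are read from the target itself, not from a global context.
            out[name] = {"compiled_by": target.kind, "options": dict(target.attrs)}
        else:
            out[name] = func
    return out


# A toy "IRModule": function name -> attributes (all values are placeholders).
module = {
    "main": {"body": "<relay expr>"},
    "trt_subgraph_0": {"Compiler": "tensorrt", "body": "<relay expr>"},
}
trt = ToyTarget("tensorrt", {"use_fp16": True})
result = toy_relay_to_tir_hook(module, trt)
```

Because the hook takes the target as an argument, everything it needs travels with the list of targets, which is the property the description relies on for Collage and for RPC boundaries.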

areusch · 2022-06-30T23:08:09Z

@masahi @mbaret interested in having a look?

masahi · 2022-07-01T00:18:01Z

> While trying to track down a 'free(): invalid pointer' error in test_tensorrt_int8_exp.py

I'm pretty sure this is the LLVM symbol conflict issue, #9362. A simple workaround is to always import torch before tvm, as in #10342.
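As a rough illustration of the import-order workaround, a test module could force the order explicitly. This is only a sketch: the helper `import_in_order` is a hypothetical name, and the guard around each import is there so the snippet also runs on machines where torch or tvm is not installed.

```python
import importlib


def import_in_order(names):
    """Import the named modules in the given order.
    A module that is not installed is recorded as None instead of raising."""
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError:
            loaded[name] = None
    return loaded


# torch is requested first, then tvm, matching the workaround described above.
modules = import_in_order(["torch", "tvm"])
```

In a real test file the same effect is usually achieved by placing `import torch` on a line above `import tvm`.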

mbs-octoml · 2022-07-01T15:06:26Z

@masahi thanks, you are right.

…elayToTIR hook

This does for the TensorRT integration what apache#11631 did for the CUTLASS integration.

- All compilation options are captured within the attributes of a Target of kind "tensorrt" (instead of the "relay.ext.tensorrt.options" attribute in PassContext). This means all BYOC configuration options needed by Collage can be captured uniformly by a list of Targets. It also means RPC boundaries (as used internally at OctoML) only need to worry about maintaining the fidelity of the Target instance(s) rather than reaching into the PassContext.
- Compilation is switched from function-at-a-time (relying on the TECompiler) to IRModule-at-a-time (using the RelayToTIR target-specific hook mechanism). Though not strictly necessary for Collage, I want to check the path is now clear to deprecate the support for BYOC in TECompiler.
- Get all the TensorRT tests going again, except for a few I've disabled with a cross-link to a new issue, apache#11765. CAUTION: The TensorRT runtime is not supported in CI, so many of these tests are cosmetic.
- While trying to track down a 'free(): invalid pointer' error in test_tensorrt_int8_exp.py I made the TensorRT allocs/frees more robust, but it turns out it's also broken in main. No harm in leaving these changes in, though.

mbs-octoml · 2022-07-01T20:24:46Z

Pretty sure I've fixed all the CI failures and this is ready for review.

mbs-octoml · 2022-07-01T22:05:09Z

Thanks @masahi! This can be merged.

mbs-octoml added 7 commits July 1, 2022 10:33

- Lints

49347a5

- Woops, fix test

b0b0860

- lints

23cf487

- Use default tensorrt target if none given in targets list

8dd0e96

- fix free error

f173fbc

- accidentally introduced 'transforms' namespace

8539ef4

- can't use default Target("tensorrt") arg

mbs-octoml force-pushed the mbs-collage-trt-refresh branch from 747cef3 to 8539ef4 on July 1, 2022 17:58

mbs-octoml added 3 commits July 1, 2022 11:40

- D'oh! Include ended up #if protected

a37e745

- restore mark for test_dynamic_offload

f395d9c

- handle missing runtime in versioning - turn test_maskrcnn_resnet50 back on now that we have the import-torch-first workaround.

- wibble

c8e10c9

mbs-octoml mentioned this pull request Jul 1, 2022

[Collage] SubGraphs #11981

Merged

masahi approved these changes Jul 1, 2022

masahi merged commit d2a14a6 into apache:main Jul 1, 2022

mbs-octoml deleted the mbs-collage-trt-refresh branch July 1, 2022 22:56

AndrewZhaoLuo mentioned this pull request Oct 4, 2022

TVM v0.10.0.rc0 Release Candidate Notes #12979

Closed
