
fix speedup with CUDA #2947

Merged
QuanluZhang merged 2 commits into microsoft:v1.9 from fix_speedup_cuda on Oct 12, 2020

Conversation

chicm-ms
Contributor

No description provided.

@zheng-ningxin
Contributor

If the mask tensor is on the wrong device, shouldn't it fail the speedup unit test?
Why does the previous code pass the speedup unit test?

@chicm-ms
Contributor Author

> If the mask tensor is on the wrong device, shouldn't it fail the speedup unit test?
> Why does the previous code pass the speedup unit test?

In v1.8, the only place the mask created in view_inshape is used is here:
https://github.com/microsoft/nni/blob/v1.8/src/sdk/pynni/nni/compression/torch/speedup/compress_modules.py#L56
That code calls to(device) to move the two tensors onto the same device before using the mask, so a mask created on the wrong device was corrected at the point of use.
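
A minimal sketch of that v1.8-style pattern, with illustrative names rather than NNI's own (apply_mask and its arguments are hypothetical): the consumer moves the mask onto the input's device before applying it, so a mask created on the CPU never clashes with a CUDA input.

```python
import torch

# Illustrative sketch only; the names below are not taken from the NNI code base.
def apply_mask(activation, mask):
    # Moving the mask onto the activation's device makes the element-wise
    # multiply valid even when the mask was built on the CPU.
    return activation * mask.to(activation.device)

device = "cuda" if torch.cuda.is_available() else "cpu"
activation = torch.randn(2, 4, device=device)
mask = torch.ones(2, 4)             # created on the CPU
out = apply_mask(activation, mask)  # fine: the mask is moved before use
```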

In v1.9 there are more usages, for example here, where the two tensors must already be on the same device before they are compared:
https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/compression/torch/speedup/infer_shape.py#L757
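
For contrast, here is a minimal sketch of the v1.9-style failure and of the fix as I understand this PR, again with hypothetical names (build_mask is not NNI's function): an element-wise comparison between two tensors requires them to share a device, so the mask has to be created on the target tensor's device up front instead of relying on a later to(device).

```python
import torch

# Illustrative sketch only; the names below are not taken from the NNI code base.
def build_mask(reference, keep_ratio=0.5):
    # Allocate the mask directly on reference.device so that later
    # tensor-vs-tensor operations (e.g. comparisons) stay on one device.
    flat = torch.zeros(reference.numel(), device=reference.device)
    flat[: int(reference.numel() * keep_ratio)] = 1
    return flat.reshape(reference.shape)

device = "cuda" if torch.cuda.is_available() else "cpu"
weight = torch.randn(4, 4, device=device)
mask = build_mask(weight)
other = torch.ones_like(weight)  # lives on the same device as weight
# With a CPU mask and a CUDA `other`, this comparison would raise
# "Expected all tensors to be on the same device"; here both sides match.
same_elements = (mask == other)
```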

QuanluZhang merged commit e5a208b into microsoft:v1.9 on Oct 12, 2020
chicm-ms deleted the fix_speedup_cuda branch on October 19, 2020 at 15:59
Tudor33 left a comment on src/sdk/pynni/nni/compression/torch/speedup/infer_shape.py
