
fix speedup with CUDA #2947

Merged
QuanluZhang merged 2 commits into microsoft:v1.9 from fix_speedup_cuda on Oct 12, 2020

Conversation

chicm-ms
Contributor

No description provided.

@zheng-ningxin
Contributor

If the mask tensor is on the wrong device, shouldn't it fail the speedup unit test?
Why does the previous code pass the speedup unit test?

@chicm-ms
Contributor Author

> If the mask tensor is on the wrong device, shouldn't it fail the speedup unit test?
> Why does the previous code pass the speedup unit test?

In v1.8, the only place the mask created in view_inshape is used is here:
https://github.com/microsoft/nni/blob/v1.8/src/sdk/pynni/nni/compression/torch/speedup/compress_modules.py#L56
That code calls to(device) to move the two tensors onto the same device before using the mask, so a mask created on the wrong device was corrected at the point of use.
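
A minimal sketch of that v1.8-style pattern, with illustrative names rather than NNI's own (apply_mask and its arguments are hypothetical): the consumer moves the mask onto the input's device before applying it, so a mask created on the CPU never clashes with a CUDA input.

```python
import torch

# Illustrative sketch only; the names below are not taken from the NNI code base.
def apply_mask(activation, mask):
    # Moving the mask onto the activation's device makes the element-wise
    # multiply valid even when the mask was built on the CPU.
    return activation * mask.to(activation.device)

device = "cuda" if torch.cuda.is_available() else "cpu"
activation = torch.randn(2, 4, device=device)
mask = torch.ones(2, 4)             # created on the CPU
out = apply_mask(activation, mask)  # fine: the mask is moved before use
```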

In v1.9 there are more usages, for example here, where the two tensors must already be on the same device before they are compared:
https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/compression/torch/speedup/infer_shape.py#L757
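
For contrast, here is a minimal sketch of the v1.9-style failure and of the fix as I understand this PR, again with hypothetical names (build_mask is not NNI's function): an element-wise comparison between two tensors requires them to share a device, so the mask has to be created on the target tensor's device up front instead of relying on a later to(device).

```python
import torch

# Illustrative sketch only; the names below are not taken from the NNI code base.
def build_mask(reference, keep_ratio=0.5):
    # Allocate the mask directly on reference.device so that later
    # tensor-vs-tensor operations (e.g. comparisons) stay on one device.
    flat = torch.zeros(reference.numel(), device=reference.device)
    flat[: int(reference.numel() * keep_ratio)] = 1
    return flat.reshape(reference.shape)

device = "cuda" if torch.cuda.is_available() else "cpu"
weight = torch.randn(4, 4, device=device)
mask = build_mask(weight)
other = torch.ones_like(weight)  # lives on the same device as weight
# With a CPU mask and a CUDA `other`, this comparison would raise
# "Expected all tensors to be on the same device"; here both sides match.
same_elements = (mask == other)
```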

QuanluZhang merged commit e5a208b into microsoft:v1.9 on Oct 12, 2020
chicm-ms deleted the fix_speedup_cuda branch on October 19, 2020 at 15:59
Tudor33 left a comment on src/sdk/pynni/nni/compression/torch/speedup/infer_shape.py
