Add different swish implementations #88

@qubvel qubvel commented Oct 14, 2019

Add different swish implementations:

  1. Memory-efficient swish
  • GPU memory friendly
  • Less computationally efficient during training
  • Not supported by torch.jit / torch.onnx
  2. Original swish (x * torch.sigmoid(x))
  • Less memory efficient
  • More computationally efficient during training
  • Model can be saved with torch.jit / torch.onnx

Default: memory-efficient swish.
The model's swish implementation can be switched with the .set_swish(memory_efficient=True/False) method.
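
For readers who want to see the trade-off concretely, here is a minimal sketch of the two variants. It is an illustration under assumptions, not necessarily the exact code merged in this PR: the class names `Swish`, `SwishFunction` and `MemoryEfficientSwish` are chosen for this example. The memory-efficient variant is assumed to be a custom `torch.autograd.Function` that saves only the input and recomputes the sigmoid in the backward pass, which would explain both the lower memory use and the extra compute during training.

```python
import torch
from torch import nn


class Swish(nn.Module):
    """Original swish: autograd keeps the intermediate tensors it needs."""

    def forward(self, x):
        return x * torch.sigmoid(x)


class SwishFunction(torch.autograd.Function):
    """Assumed memory-efficient variant: save only the input tensor."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # sigmoid(x) is not stored
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)  # recomputed here, costing extra compute
        # d/dx [x * sigmoid(x)] = s * (1 + x * (1 - s))
        return grad_output * (s * (1 + x * (1 - s)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishFunction.apply(x)


# Compare outputs and gradients of the two variants on the same input.
x1 = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
x2 = x1.detach().clone().requires_grad_(True)

y1 = Swish()(x1)
y2 = MemoryEfficientSwish()(x2)
y1.sum().backward()
y2.sum().backward()

print(torch.allclose(y1, y2), torch.allclose(x1.grad, x2.grad))
```

Both variants compute the same function and the same gradient; they differ only in what is kept in memory between the forward and backward passes.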

@lukemelas lukemelas merged commit 8a5da1d into lukemelas:master Oct 15, 2019
@glenn-jocher

@qubvel thanks for the function! I've tried to implement this in our repo: https://github.com/ultralytics/yolov3, but I get worse results (lower mAP and higher loss) compared to a default Swish() class. Do you know why this might be? See ultralytics/yolov3#441 (comment)

@cswwp
cswwp commented Aug 5, 2020

@qubvel If I train with memory-efficient swish and then export model.pt with model.set_swish(memory_efficient=False) + torch.jit.trace(model, example), will it hurt the score?
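
One way to check this empirically is to compare the model's outputs before and after switching the activation and tracing. The sketch below is only an illustration: `TinyNet` is a hypothetical stand-in model with its own `set_swish` method, not the model from this repository, and the memory-efficient swish is assumed to be a custom autograd function that computes the same forward value as `x * torch.sigmoid(x)`.

```python
import torch
from torch import nn


class _SwishFn(torch.autograd.Function):
    """Assumed memory-efficient swish: saves only the input."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)
        return grad_output * (s * (1 + x * (1 - s)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return _SwishFn.apply(x)


class Swish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)


class TinyNet(nn.Module):
    """Hypothetical stand-in model used only for this check."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.act = MemoryEfficientSwish()
        self.fc2 = nn.Linear(16, 4)

    def set_swish(self, memory_efficient=True):
        # The activation has no parameters, so swapping it keeps all weights.
        self.act = MemoryEfficientSwish() if memory_efficient else Swish()

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))


torch.manual_seed(0)
model = TinyNet().eval()
example = torch.randn(2, 8)

with torch.no_grad():
    out_before = model(example)  # memory-efficient swish

model.set_swish(memory_efficient=False)
traced = torch.jit.trace(model, example)

with torch.no_grad():
    out_traced = traced(example)  # original swish, traced

max_abs_diff = (out_before - out_traced).abs().max().item()
print(f"max abs difference: {max_abs_diff:.3e}")
```

If the two variants share the same forward formula, as assumed here, the difference should be at floating-point noise level, since only the backward pass differs between them.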
