
[Features] Support FP16 training #198

Merged 2 commits into main on Feb 3, 2023

Conversation

@rentainhe (Collaborator) commented on Feb 2, 2023

TODO

  • Support fp16 training, which reduces GPU memory usage by roughly 20-30%.
  • fp16 training baseline on dino-r50-4scale-12ep: 49.1 AP (with amp) vs. 49.2 AP (without amp).

Note

For MultiScaleDeformableAttention, we simply cast the input value to torch.float32 before the operator and cast its output back to torch.float16 afterwards. In other words, we skip fp16 and run the MultiScaleDeformableAttention operator entirely in fp32, as sketched below.
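
A minimal sketch of this fp32 fallback, where sampling_fn is a placeholder standing in for the custom MultiScaleDeformableAttention kernel call (this is not the actual detrex code):

import torch

def forward_fp32(value, sampling_fn, *args):
    # hypothetical wrapper; sampling_fn stands in for the
    # MultiScaleDeformableAttention operator
    orig_dtype = value.dtype
    if orig_dtype == torch.float16:
        value = value.float()           # fp16 -> fp32 before the operator
    output = sampling_fn(value, *args)  # computation runs in fp32
    return output.to(orig_dtype)        # cast back to fp16 for the rest of the model

The rest of the model keeps running in fp16; only this operator computes in fp32.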

Usage

Start fp16 training by setting train.amp.enabled=True:

python tools/train_net.py \
    --config-file projects/dab_detr/configs/dab_detr_r50_50ep.py \
    --num-gpus 8 \
    train.amp.enabled=True
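
Setting train.amp.enabled=True enables automatic mixed precision (amp). For reference, the standard torch.cuda.amp pattern this kind of training relies on looks roughly like the generic sketch below (this is not detrex's actual trainer loop; model, optimizer, and data_loader are placeholders):

import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()
for images, targets in data_loader:        # placeholder data loader
    optimizer.zero_grad()
    with autocast():                       # forward pass in mixed precision
        loss_dict = model(images, targets) # placeholder detection model
        losses = sum(loss_dict.values())
    scaler.scale(losses).backward()        # scale loss to avoid fp16 gradient underflow
    scaler.step(optimizer)                 # unscales gradients, then steps the optimizer
    scaler.update()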

@rentainhe rentainhe changed the title Support FP16 training [Features] Support FP16 training Feb 2, 2023
@zengzhaoyang zengzhaoyang merged commit 79f8bb7 into main Feb 3, 2023
@rentainhe rentainhe deleted the support_fp16_training branch February 3, 2023 02:23
@FabianSchuetze

Thanks for this commit!

Did you observe instabilities when using the deformable attention layer with fp16? Is there another reason why the deformable attention layer cannot be used with fp16?

Lontoone pushed a commit to Lontoone/detrex that referenced this pull request on Jan 8, 2024: [Features] Support FP16 training