Why MultiGPU dp seems slower? #1005
ricardorei asked this question in DDP / multi-GPU / multi-node · Answered by williamFalcon
❓ Questions and Help

Using 2 GPUs with DP seems to be slower than using just 1. Is that normal?

My intuition is that if you are using 2 GPUs and the batch is split into 2 smaller batches, training should be faster. But when I tested the same code with 1 GPU vs. more than 1, my epoch time increased.

Code
Minimalist Implementation of a BERT Sentence Classifier

What have you tried?
I also tried to run ddp, but my code seems to break with a

What's your environment?
Answered by williamFalcon on Mar 3, 2020
Replies: 1 comment
Answer selected by Borda
You should double your batch size.
DP still has communication overhead, so it won't scale linearly.
Also try DDP.
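
To make the batch-size advice concrete, here is a small arithmetic sketch (not code from the linked repository). It assumes DP behaves like `torch.nn.DataParallel`, which splits each input batch along the batch dimension across the GPUs. The dataset size and the batch size of 32 are hypothetical numbers chosen for illustration, and the helper names `per_gpu_batch` and `steps_per_epoch` are made up for this example.

```python
import math


def per_gpu_batch(global_batch: int, num_gpus: int) -> int:
    """Approximate number of samples each GPU receives per step under DP.

    DP scatters one batch across the GPUs, so each replica sees
    roughly global_batch / num_gpus samples.
    """
    return math.ceil(global_batch / num_gpus)


def steps_per_epoch(dataset_size: int, global_batch: int) -> int:
    """Number of optimizer steps needed to see the whole dataset once."""
    return math.ceil(dataset_size / global_batch)


if __name__ == "__main__":
    dataset_size = 10_000  # hypothetical dataset size
    base_batch = 32        # hypothetical single-GPU batch size

    # (number of GPUs, batch size passed to the DataLoader)
    configs = [(1, base_batch), (2, base_batch), (2, 2 * base_batch)]
    for num_gpus, batch in configs:
        print(
            f"gpus={num_gpus} batch={batch} "
            f"per_gpu={per_gpu_batch(batch, num_gpus)} "
            f"steps={steps_per_epoch(dataset_size, batch)}"
        )
```

With the batch size unchanged, 2 GPUs run the same number of steps per epoch as 1 GPU, each GPU does only half the work per step, and every step still pays the scatter/gather cost. Doubling the batch size keeps the per-GPU batch the same as in the single-GPU run and halves the number of steps, which is where a speedup can come from.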