Encoder and Decoder have different TP_SIZE #1121

Open
heavyrain-lzy opened this issue Sep 5, 2024 · 0 comments


Your question

1. I have a question about how the pipeline-parallel (pp) groups are created when context_parallel_size > 1 and encoder_tensor_parallel_size != tensor_parallel_size.

When context parallelism is enabled, the input is split symmetrically across the CP ranks to balance the computation, so pairing encoder and decoder ranks with zip(cycle(e_ranks), d_ranks) is wrong in this case. The code comment in question:

    # Map 1 encoder tp rank to several decoder tp ranks, because
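For reference, a minimal sketch of what the zip(cycle(e_ranks), d_ranks) pairing does; the rank lists below are hypothetical and chosen only to illustrate the construct, not the actual Megatron-Core group layout:

```python
from itertools import cycle

# Hypothetical rank lists: one encoder TP rank and four decoder TP ranks,
# e.g. encoder_tensor_parallel_size=1 and tensor_parallel_size=4.
e_ranks = [0]
d_ranks = [1, 2, 3, 4]

# cycle() repeats the (shorter) encoder rank list, so every decoder TP rank
# gets paired with some encoder TP rank -- one encoder rank is mapped to
# several decoder ranks, as the quoted comment describes.
pairs = list(zip(cycle(e_ranks), d_ranks))
print(pairs)  # [(0, 1), (0, 2), (0, 3), (0, 4)]
```

The concern above is whether this one-to-many pairing still yields balanced groups once each rank's input is additionally split along the context-parallel dimension.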

2. Why do we use the stack operator to compute the sum of the received tensors?

    return torch.stack(x, dim=0).sum(dim=0, dtype=torch.float32).to(x[0].dtype)
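For context, a small self-contained sketch of that reduction; the tensor list x is synthetic, and only the return line is taken from the code in question. One plausible reason for stacking first is that the sum then accumulates in float32 in a single reduction before casting back to the input dtype:

```python
import torch

def stack_sum(x):
    # Stack the received tensors along a new leading dim and reduce them in
    # float32, then cast the result back to the original dtype. Accumulating
    # in float32 avoids precision loss when the inputs are fp16/bf16.
    return torch.stack(x, dim=0).sum(dim=0, dtype=torch.float32).to(x[0].dtype)

# Synthetic "received" tensors, e.g. one per sending rank.
x = [torch.randn(2, 3, dtype=torch.bfloat16) for _ in range(4)]
out = stack_sum(x)

# A naive alternative: a running in-place sum, which accumulates in bf16
# and can lose precision as the number of summands grows.
naive = x[0].clone()
for t in x[1:]:
    naive += t

print(out.dtype, naive.dtype)  # torch.bfloat16 torch.bfloat16
```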