Generation with model parallel Megatron LM #2358
Comments
I think this is on me; I never got around to fixing the generate script with Megatron MP. I can look into it, but no promises on a timeline yet. Maybe give that a try here?
Sure, I can try that. I'd appreciate it if you could give an example that shows how to stitch the model parts into one.
Can anyone confirm that Megatron 11b treats all contiguous spaces as a single space? With some hacky code I have it successfully generating on 2 GPUs (after merging and re-splitting the partitions), but it doesn't seem to understand line breaks. That's a little disappointing, since it seems smarter than GPT-2 in a lot of other ways. Perhaps this code was used during training?
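One way to test that hypothesis (a sketch, assuming the standard GPT-2 BPE files that fairseq's Megatron-11b setup uses; the file paths are placeholders) is to encode two strings that differ only in spacing and compare the resulting token ids:

```python
from fairseq.data.encoders.gpt2_bpe_utils import get_encoder

# encoder.json / vocab.bpe are the GPT-2 BPE files fairseq downloads;
# paths here are placeholders.
bpe = get_encoder("encoder.json", "vocab.bpe")

single = bpe.encode("hello world")
multi = bpe.encode("hello    world")

# If these match, contiguous spaces were normalized away before the model
# ever saw them; if they differ, any collapsing happens elsewhere.
print(single == multi, single, multi)
```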
Does anyone have an example of stitching the model parts together? Did the approach work to generate text with megatron_11b?
To join the model chunks, maybe one could try something like the sketch below. P.S.: make sure to copy the 'version' entries as well; you might lose the normalization layers otherwise.
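For what it's worth, here is a rough sketch of that idea, not an official script. Assumptions (adjust for your checkpoint): two fairseq-style shards named model_part_0.pt and model_part_1.pt; the usual Megatron convention that column-parallel weights (fc1, q/k/v projections, vocab-parallel embeddings) are split along dim 0 and row-parallel weights (fc2, out_proj) along dim 1; everything else, including the layer norms, is replicated. The key patterns are illustrative and may need adjusting:

```python
import torch

parts = [torch.load(f"model_part_{i}.pt", map_location="cpu") for i in range(2)]
shards = [p["model"] for p in parts]

COLUMN_PARALLEL = ("fc1", "q_proj", "k_proj", "v_proj", "embed_tokens")
ROW_PARALLEL = ("fc2", "out_proj")

merged_model = {}
for key in shards[0]:
    tensors = [sd[key] for sd in shards]
    if "version" in key:
        # Carry the '.version' entries over: fairseq's state-dict upgrade
        # logic treats a missing version as an old checkpoint and may
        # reset the normalization layers on load.
        merged_model[key] = tensors[0]
    elif any(p in key for p in COLUMN_PARALLEL):
        merged_model[key] = torch.cat(tensors, dim=0)  # split along output dim
    elif any(p in key for p in ROW_PARALLEL) and tensors[0].dim() > 1:
        merged_model[key] = torch.cat(tensors, dim=1)  # split along input dim
    else:
        # Replicated parameters (layer norms, row-parallel biases, scalars).
        merged_model[key] = tensors[0]

out = {k: v for k, v in parts[0].items() if k != "model"}
out["model"] = merged_model
torch.save(out, "model_merged.pt")
```

To actually run the merged checkpoint on a single GPU, you would presumably also need to load it without the model-parallel arguments baked into the saved config, which is part of what makes this fiddly.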
Hello @ngoyal2707, did you get a chance to push the scripts to manage the partitions somewhere?
This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!
Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!
❓ Questions and Help
What is your question?
Is there an example demonstrating how to generate text with a Megatron LM that was trained with model parallelism? The Megatron LM page shows how to run evaluation, but there's no information on running generation.
What have you tried?
I tried running the command below but got an error.
Command:
Error:
```
/opt/conda/conda-bld/pytorch_1579022034529/work/aten/src/THC/THCTensorScatterGather.cu:100: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [3,0,0] Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed.
```
After some debugging, I found that this line in the code caused the above error, but I'm unsure of the cause. It's possible there are some setup issues (data, etc.). But an example of how to set up and run generation with a model parallel Megatron LM would be great. Thank you.
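In case it helps with debugging: that assertion fires when a gather/index op receives an index outside the source tensor's size, which for a language model is often a token id beyond the embedding table (e.g., a dictionary/checkpoint mismatch). A quick sanity check along those lines (a sketch; the paths and the assumption that this is a vocab mismatch are mine, not confirmed):

```python
import torch

# Compare the checkpoint's embedding size with the dictionary being loaded.
ckpt = torch.load("checkpoint.pt", map_location="cpu")  # placeholder path
embed = next(v for k, v in ckpt["model"].items()
             if k.endswith("embed_tokens.weight"))
print("vocab rows in checkpoint:", embed.size(0))

# fairseq prepends 4 special symbols (<s>, <pad>, </s>, <unk>) to dict.txt
# and may append 'madeupword' padding; every token id produced at generation
# time must stay below the checkpoint's row count, or gather goes out of bounds.
with open("dict.txt") as f:  # placeholder path
    n_symbols = sum(1 for _ in f)
print("dict.txt symbols + 4 specials:", n_symbols + 4)
```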