🌟 New model addition - GPT-J-6B #12098
Comments
Hello @patrickvonplaten! Would you be able to give an estimate of the timeline for implementing this model in Hugging Face?
I have a PR adding support for this model here: #12098
Yeah, copied the wrong thing somehow.
@finetuneanon great! Do you know when it will be ready to use from the transformers library? Thanks for the work.
Depends on when it will be merged. Until then you can install my branch like this:
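The install command itself is missing here; a later comment in this thread mentions installing `transformers@gpt-j`, so it was presumably something along these lines (the fork URL and branch name are assumptions, not copied from the PR):

```bash
# Assumed reconstruction: install the fork's gpt-j branch directly from GitHub.
# Take the exact fork URL and branch name from the linked PR.
pip install git+https://github.com/finetuneanon/transformers@gpt-j
```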
Convert the weights with the conversion script linked from the PR.
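The actual conversion script ships with the PR; as a rough sketch only, invoking it would look something like the following (the script name, flags, and paths are placeholders, not the real ones):

```bash
# Placeholder invocation: substitute the real script name and arguments from the PR.
# ./slim_weights is assumed to hold the downloaded bf16 checkpoint;
# ./gpt-j-6b is where the converted PyTorch checkpoint would be written.
python convert_gpt_j_weights.py --checkpoint_dir ./slim_weights --output_dir ./gpt-j-6b
```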
@finetuneanon I did pip install the transformers@gpt-j branch and managed to convert the weights with the script you referenced, but the one thing I'm still struggling with is the config file. I uploaded the gpt-j-6b.json file to Colab, but I don't know how to create the config variable via the AutoConfig class (if that is even how it's done). If you could let me know how to make the config file, I would appreciate it a lot.
Rename it to config.json, put it into a folder, and you should be able to load it from that folder.
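For reference, a minimal sketch of what that looks like, assuming the renamed config.json and the converted weights both sit in a local folder called ./gpt-j-6b (the folder name is just an example):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# "./gpt-j-6b" is an example path: it should contain the converted weights
# plus gpt-j-6b.json renamed to config.json.
config = AutoConfig.from_pretrained("./gpt-j-6b")
model = AutoModelForCausalLM.from_pretrained("./gpt-j-6b", config=config)
```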
🌟 New model addition - GPT-J-6B
Model description
The GPT-J-6B model (a GPT-Neo-style model in JAX with 6B parameters, trained on the Pile)
Repo: https://github.com/kingoflolz/mesh-transformer-jax
Weights:
Slim weights (bf16 weights only, for inference, 9GB)
Full weights (including optimizer params, 61GB)
Open source status