
🌟 New model addition - GPT-J-6B #12098

Closed
2 of 3 tasks
Xirider opened this issue Jun 10, 2021 · 8 comments · Fixed by #13022

Comments

@Xirider

Xirider commented Jun 10, 2021

🌟 New model addition - GPT-J-6B

Model description

The GPT-J-6B model (a GPT-Neo-style model implemented in JAX, with 6B parameters, trained on the Pile)

Repo: https://github.com/kingoflolz/mesh-transformer-jax

Weights:
Slim weights (bf16 weights only, for inference, 9GB)
Full weights (including optimizer params, 61GB)

Open source status

  • [x] the model implementation is available: https://github.com/kingoflolz/mesh-transformer-jax
  • [x] the model weights are available: slim and full weights linked above
  • [ ] who are the authors: (mention them, if possible by @gh-username)
@product-copy

Hello @patrickvonplaten! Would you be able to give an estimate of the timeline for implementing this model in Hugging Face?

@finetunej

I have a PR adding support for this model here: #12098

@KopfKrieg

I have a PR adding support for this model here: #12098

You probably wanted to link this PR: #12106

@finetunej

Yeah, I copied the wrong link somehow.

@Dilyarbuzan

@finetuneanon great! Do you know when it will be ready to use from the transformers library? Thanks for the work.

@finetunej

That depends on when it gets merged. Until then, you can install my branch like this:

pip install git+https://github.com/finetuneanon/transformers@gpt-j

Convert the weights with the conversion script linked from the PR.

@Dilyarbuzan

@finetuneanon I pip-installed transformers@gpt-j and managed to convert the weights with the script you referenced, but the one thing I'm still struggling with is creating the config file. I uploaded the gpt-j-6b.json file to Colab, but I don't know how to build the config via the AutoConfig class (if that's even the right way to do it). If you could let me know how to make the config file, I would appreciate it a lot.
This Colab file contains all the code.

@finetunej

Rename it to config.json, put it in a folder, and you should be able to call AutoConfig.from_pretrained("whatever-folder").
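In case it helps, here is a minimal sketch of the folder setup being described. The file and folder names (gpt-j-6b.json, gpt-j-6b) are just examples; adjust them to wherever your converted checkpoint actually lives:

```python
from pathlib import Path

# Hypothetical names: use whatever folder/file you actually have.
folder = Path("gpt-j-6b")
folder.mkdir(exist_ok=True)

src = Path("gpt-j-6b.json")
if src.exists():
    # from_pretrained expects the config to be named exactly config.json
    src.rename(folder / "config.json")

# Put the converted weights (e.g. pytorch_model.bin) in the same folder, then:
#   from transformers import AutoConfig
#   config = AutoConfig.from_pretrained("gpt-j-6b")
```

The key point is that from_pretrained takes the folder path, not the JSON file itself, and looks for config.json inside it.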
