
🌟 New model addition - GPT-J-6B #12098

Closed
2 of 3 tasks
Xirider opened this issue Jun 10, 2021 · 8 comments · Fixed by #13022

Comments

@Xirider

Xirider commented Jun 10, 2021

🌟 New model addition - GPT-J-6B

Model description

The GPT-J-6B model (a GPT-Neo-style model implemented in JAX, with 6B parameters, trained on the Pile)

Repo: https://github.com/kingoflolz/mesh-transformer-jax

Weights:
Slim weights (bf16 weights only, for inference, 9GB)
Full weights (including optimizer params, 61GB)

Open source status

  • [x] the model implementation is available: https://github.com/kingoflolz/mesh-transformer-jax
  • [x] the model weights are available: slim and full weights linked above
  • [ ] who are the authors: (mention them, if possible by @gh-username)
@product-copy

Hello @patrickvonplaten! Would you be able to give an estimate of the timeline for implementing this model in Hugging Face?

@finetunej

I have a PR adding support for this model here: #12098

@KopfKrieg

I have a PR adding support for this model here: #12098

You probably wanted to link this PR: #12106

@finetunej

Yeah, I copied the wrong link somehow.

@Dilyarbuzan

@finetuneanon great! Do you know when it will be ready to use from the transformers library? Thanks for the work.

@finetunej

That depends on when it gets merged. Until then, you can install my branch like this:

pip install git+https://github.com/finetuneanon/transformers@gpt-j

Convert the weights with the conversion script linked from the PR.

@Dilyarbuzan

@finetuneanon I pip-installed transformers@gpt-j and managed to convert the weights with the script you referenced, but the one thing I'm still struggling with is creating the config file. I uploaded the gpt-j-6b.json file to Colab, but I don't know how to build the config via the AutoConfig class (if that's even the right way to do it). If you could let me know how to make the config file, I would appreciate it a lot.
This Colab file contains all the code.

@finetunej

Rename it to config.json, put it in a folder, and you should be able to call AutoConfig.from_pretrained("whatever-folder").
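In case it helps, here is a minimal sketch of the folder setup being described. The file and folder names (gpt-j-6b.json, gpt-j-6b) are just examples; adjust them to wherever your converted checkpoint actually lives:

```python
from pathlib import Path

# Hypothetical names: use whatever folder/file you actually have.
folder = Path("gpt-j-6b")
folder.mkdir(exist_ok=True)

src = Path("gpt-j-6b.json")
if src.exists():
    # from_pretrained expects the config to be named exactly config.json
    src.rename(folder / "config.json")

# Put the converted weights (e.g. pytorch_model.bin) in the same folder, then:
#   from transformers import AutoConfig
#   config = AutoConfig.from_pretrained("gpt-j-6b")
```

The key point is that from_pretrained takes the folder path, not the JSON file itself, and looks for config.json inside it.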
