Add two popular datasets for character level LM #254

entron · 2023-04-25T07:59:56Z

Added data preparation scripts and example training configs for two popular datasets: text8 and enwik8.

entron · 2023-04-25T11:29:38Z

In the 2nd commit I also tried feeding the output of each layer back into the same layer multiple times.
For the shakespeare_char dataset, this actually gives a better val loss of 1.4543 with only 1 layer and 1.8M parameters.
For bigger datasets such as text8, this also gives better results when the
number of parameters is the same. I haven't tested it on GPT-2 yet.
The 2nd commit may not be that relevant to this PR, though.
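
For readers following along, here is a minimal sketch of the idea described above, assuming it means "apply each layer several times in a row with the same weights". The removed commit is not shown in this thread, so `forward_with_repeats`, `n_repeat`, and `toy_residual_layer` are hypothetical names used only for illustration, and the toy layer is a plain-Python stand-in for a transformer block.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def forward_with_repeats(x: Vector, layers: Sequence[Layer], n_repeat: int) -> Vector:
    """Apply each layer n_repeat times in a row before moving to the next one."""
    for layer in layers:
        for _ in range(n_repeat):
            # The same layer (same weights) is reused on every pass, so the
            # parameter count does not grow with n_repeat.
            x = layer(x)
    return x


def toy_residual_layer(x: Vector) -> Vector:
    # Hypothetical stand-in for a block: residual update x + f(x) with f(x) = 0.5 * x.
    return [v + 0.5 * v for v in x]


# One layer applied twice behaves like a two-layer stack with tied weights.
out = forward_with_repeats([1.0, 2.0], [toy_residual_layer], n_repeat=2)
print(out)
```

With `n_repeat=1` this reduces to an ordinary forward pass through the stack, which makes the two setups easy to compare at equal parameter count.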

karpathy · 2023-04-26T03:35:53Z

That's nice, but prefer we keep n_layer_update separate

entron · 2023-04-26T05:22:33Z

I have removed the 2nd commit.

Andrei-Aksionov · 2023-04-27T11:03:31Z

Maybe it's about time to have a separate .py file with the shared logic?
All the prepare.py files, for shakespeare and for these two new datasets, basically do the same thing.
I understand that some code duplication is sometimes better for the sake of simplicity and ease of understanding, but this is not the case here (in my opinion).

I am open to hearing why I am wrong (again 😄 ).
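
To make the suggestion concrete, a shared helper could look roughly like the sketch below. It is only an illustration under assumptions: `prepare_char_dataset`, its parameters, the 90/10 split, and the output file names `train.bin`, `val.bin`, and `meta.pkl` are not taken from this PR's diff, and it uses the standard-library `array` module instead of NumPy to stay dependency-free.

```python
import os
import pickle
from array import array


def prepare_char_dataset(text, out_dir, train_fraction=0.9):
    """Build a character vocabulary, split the text, and write token ids to disk."""
    # Vocabulary: every distinct character, in sorted order.
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for i, ch in enumerate(chars)}

    # Contiguous train/val split (fraction is an assumption, not from the PR).
    split = int(len(text) * train_fraction)
    splits = {"train": text[:split], "val": text[split:]}

    os.makedirs(out_dir, exist_ok=True)
    for name, chunk in splits.items():
        # "H" is an unsigned short, enough for any character-level vocabulary
        # with fewer than 65536 symbols.
        ids = array("H", (stoi[c] for c in chunk))
        with open(os.path.join(out_dir, f"{name}.bin"), "wb") as f:
            ids.tofile(f)

    # Save the mappings so that sampling code can decode ids back to text.
    meta = {"vocab_size": len(chars), "stoi": stoi, "itos": itos}
    with open(os.path.join(out_dir, "meta.pkl"), "wb") as f:
        pickle.dump(meta, f)
    return meta
```

Each dataset's prepare.py would then only need to download and read its own raw file and call the helper, e.g. `prepare_char_dataset(raw_text, os.path.dirname(__file__))`.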

Add two popular datasets for character level LM

316aa1c

entron force-pushed the master branch from 1a9def7 to 93f2c59 Compare April 26, 2023 05:19

entron force-pushed the master branch from 93f2c59 to 316aa1c Compare April 26, 2023 05:29
