How to fine tune the pre-trained GloVe vectors on a custom corpus #189
Comments
There's an option to save the initial parameters, -save-init-param. The
-load-init-param functionality is to read those parameters back in. You
can also read in the parameters from an intermediate model. Look at
the save_params function for the format.
Thank you for the help
@AngledLuffa Also, there is a shuffling step before training. How does that affect the initialization step? Regards,
Sorry for the late reply. Basically, there's a specific format in glove.c
where an array is written out as a sequence of bytes. You can look at
load_init_params and save_params to see this format. In Python, the
equivalent call for writing an int as a sequence of bytes is
int.to_bytes()
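For illustration, here is a minimal sketch of writing numbers out as raw bytes from Python. The helper names pack_array and unpack_array are hypothetical, and the element type ("d", an 8-byte float) and native byte order are assumptions rather than a description of what glove.c expects. Check save_params and load_init_params for the actual element type, ordering, and array length before writing a real file.

```python
import struct
import sys


def pack_array(values, fmt="d"):
    """Serialize numbers as consecutive fixed-width values (hypothetical helper).

    fmt is a struct format character; "d" (8-byte float) is an assumption
    and should be matched to the element type used in glove.c.
    """
    # "=" selects native byte order with standard sizes and no padding.
    return struct.pack(f"={len(values)}{fmt}", *values)


def unpack_array(data, fmt="d"):
    """Read back a byte string produced by pack_array (hypothetical helper)."""
    size = struct.calcsize("=" + fmt)
    return list(struct.unpack(f"={len(data) // size}{fmt}", data))


# A single int written as 4 bytes in the machine's byte order.
n = 300
raw_int = n.to_bytes(4, byteorder=sys.byteorder, signed=True)

# A small array of parameters written as one contiguous byte sequence.
params = [0.25, -0.5, 1.0]
raw = pack_array(params)

# Writing to disk would then be a plain binary write, for example:
# with open("init.bin", "wb") as f:  # file name is a placeholder
#     f.write(raw)
```

Reading the bytes back with unpack_array is a quick way to confirm the round trip before handing the file to the training code.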
Hi,
Hope you are having a great time. I need to fine-tune the pre-trained GloVe vectors on a custom corpus and I was wondering how I can do it with the GloVe library. My understanding of fine-tuning is to initialize the value of word vectors (at the beginning of fine-tuning) to the values of the pre-trained word vectors.
There is a parameter in "glove.c" named "load_init_param". If it is set to "1", the code looks for an "-init-param-file" file and reads the parameters from it. I tried to work out what the format of the initialization file should be and whether the initial word vectors are part of these initialization parameters, but since C is not my programming language, I could not follow all the details. I would appreciate it if someone could help me initialize the word vectors with the pre-trained word vectors so that I can fine-tune GloVe on my corpus.
Thanks
Maryam