Track training loss while using Doc2Vec #2983
Comments
Loss-tallying has never yet been implemented for Gensim's Doc2Vec. So, there are not yet reliable hooks for early stopping in any of the *2Vec models.
Then how do I choose the best model? Should I just blindly train the model for many epochs, rather than the standard 20 epochs / 5 iterations? Will that give any better results? Do you happen to know of *2Vec models in libraries other than Gensim that can do this?
The internal loss can't tell you which model is best for a downstream purpose, only that the model isn't benefitting, on its internal goals, from further training. (A model settling at a lower internal loss may be worse, for some outside purpose, than one settling at a higher internal loss.) So a lot of trial and error, though perhaps assisted with automated parameter search, is involved in picking the best model.

(When you say "standard 20 epochs 5 iterations", I suspect you might be making a common training mistake, since those usually shouldn't be separate values. But your code excerpt doesn't show your call(s) to train().)

I don't know of any library offering loss-reporting from a Doc2Vec implementation.
Is there a solution here? I am getting a training loss of 0 for every epoch. After the first epoch the results are pretty nice, but after the second they are terrible. Yet it's a black box and I have no ability to monitor the loss. Thoughts? Is there another implementation of Word2Vec outside of gensim?
@griff4692 - There are other word2vec options, but I'm not familiar with an alternate Python implementation of the "Paragraph Vectors" algorithm (aka Doc2Vec). If results after one epoch are good, but after more epochs are bad, there are probably other serious errors in your code which would need to be reviewed to be discovered. (That is: improvising an early stop via loss-monitoring is probably the wrong fix.) See for example this SO answer about some really-misguided code that's unfortunately very common in oft-mimicked low-quality online examples. Real improvement to the loss-tracking in Gensim's *2Vec models is still pending.
Problem description
I am trying to track training loss using the doc2vec algorithm, and it failed. Is there a way to track training loss in doc2vec?
Also, I didn't find any documentation related to performing early stopping during the doc2vec training phase.
The similarity score varies a lot based on epochs, and I want to stop training, via callbacks, when the model has reached optimal capacity. I have used Keras, which has an EarlyStopping feature, but I'm not sure how to do this with gensim models.
Any response is appreciated. Thank you!
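For context on the early-stopping request: Gensim has no built-in equivalent of Keras's EarlyStopping, but the same patience logic is easy to drive from any external per-epoch score (e.g. a similarity metric computed between epochs). A minimal sketch, with all names and the score sequence invented for illustration:

```python
class EarlyStopper:
    """Keras-style patience logic driven by an external validation score
    (higher is better). Illustrative only -- not a Gensim API."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def update(self, score):
        """Record one epoch's score; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=2)
scores = [0.50, 0.62, 0.61, 0.60, 0.59]  # made-up per-epoch scores
stopped_at = None
for epoch, s in enumerate(scores):
    if stopper.update(s):
        stopped_at = epoch  # stops once 2 epochs fail to beat the best
        break
```

In practice the `update()` call would sit inside a per-epoch callback or between separate evaluation passes, feeding in whatever external metric matters for your task.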
Steps/code/corpus to reproduce
Versions
Please provide the output of: