Fix train error of ConcatenatedDoc2Vec in the notebook of doc2vec-IMDB #1377
Merged
Changes from 18 commits
25 commits
1aa3f33 fix the compatibility between python2 & 3 (robotcator)
24e6331 Merge https://github.com/RaRe-Technologies/gensim into fix-word2vec-n… (robotcator)
f6f571f require explicit corpus size, epochs for train() (gojomo)
5e9529b make all train() calls use explicit count, epochs (gojomo)
5c24a90 add tests to make sure that ValueError is indeed thrown (robotcator)
c89f285 update test (robotcator)
10ff8a5 fix the word2vec's reset_from() (robotcator)
a6312ca Merge branch 'fix-word2vec' into fix-word2vec-notebook (robotcator)
be5216a Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
504bd09 require explicit corpus size, epochs for train() (gojomo)
43f9689 make all train() calls use explicit count, epochs (gojomo)
49e3d00 update notebooks (robotcator)
c9eab32 fix some error (robotcator)
8024eb5 fix test error (robotcator)
d3562b6 Merge branch 'test-word2vec' of https://github.com/robotcator/gensim … (robotcator)
ff93cdf Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
67f0367 Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
8a6098a fix the train error of ConcatenatedDoc2Vec (robotcator)
04cf9cd update the ConcatenatedDoc2Vec class (robotcator)
09a2691 Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
623add0 Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
b365e2a Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
2e15945 update the parameters (robotcator)
5306c0a rerun all the cells (robotcator)
2aaecff Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim… (robotcator)
A simpler and more robust fix would be to change the ConcatenatedDoc2Vec class, in `test_doc2vec.py`, to make its (no-op) `train()` match the new `train()` parameters-signature.
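To illustrate the suggestion above, here is a minimal sketch of a wrapper whose no-op `train()` tolerates any call signature. The class name `ConcatenatedModelSketch` and its plain-list vectors are hypothetical stand-ins for illustration, not the actual class in `test_doc2vec.py`; accepting `*args` and `**kwargs` is one possible way to match the new signature, not necessarily the change that was merged.

```python
class ConcatenatedModelSketch:
    """Hypothetical stand-in for a wrapper around already-trained models."""

    def __init__(self, models):
        # Each "model" only needs to support lookup by key here.
        self.models = models

    def __getitem__(self, key):
        # Concatenate the per-model vectors (plain lists in this sketch).
        combined = []
        for model in self.models:
            combined.extend(model[key])
        return combined

    def train(self, *ignored_args, **ignored_kwargs):
        # No-op: the wrapped models are trained separately. Accepting any
        # arguments keeps callers working when they start passing
        # total_examples and epochs.
        pass


# Dicts stand in for trained models in this example.
wrapper = ConcatenatedModelSketch([{"doc0": [1.0, 2.0]}, {"doc0": [3.0]}])
wrapper.train(["doc0"], total_examples=1, epochs=1)
```

With this shape, a notebook loop can call `train()` on plain and concatenated models the same way without special-casing the wrapper.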
Yes, you are right. If the train() method is modified, the total_examples and epochs should be provided. But the ConcatenatedDoc2Vec class has no attribute 'corpus_count'.
The call doesn't have to use `train_model.corpus_count` from inside the model - it can just use `len(doc_list)`. And since the outside loop is handling the multiple passes, the `epochs` argument should be `1`.
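The calling pattern described in that comment could look roughly like the sketch below. `RecordingModel` and `run_passes` are hypothetical names used only for illustration: the stub records its `train()` calls instead of learning anything, and raising `ValueError` when the arguments are omitted is an assumption based on the commit messages in this PR.

```python
import random


class RecordingModel:
    """Hypothetical stand-in for a model; it only records train() calls."""

    def __init__(self):
        self.calls = []

    def train(self, documents, total_examples=None, epochs=None):
        # Assumed behaviour: both values must be passed explicitly.
        if total_examples is None or epochs is None:
            raise ValueError("total_examples and epochs must be given explicitly")
        self.calls.append((len(documents), total_examples, epochs))


def run_passes(train_model, doc_list, passes, seed=0):
    """Outer loop that owns the multiple passes over the corpus."""
    rng = random.Random(seed)
    docs = list(doc_list)
    for _ in range(passes):
        rng.shuffle(docs)  # reshuffle before each pass
        # The corpus size comes from the list itself, not from the model,
        # and each call performs exactly one pass.
        train_model.train(docs, total_examples=len(docs), epochs=1)


model = RecordingModel()
run_passes(model, ["doc a", "doc b", "doc c"], passes=4)
```

Because the size is taken from `len(doc_list)`, the same call works for a wrapper object that has no `corpus_count` attribute.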