Should Doc2Vec.load_word2vec_format return a Doc2Vec instance? #322
Comments
Previously, Doc2Vec didn't have a separate container for document vectors (docvecs), which is probably why returning a Word2Vec object was not a big issue. But now, if it returns Word2Vec, it loses all information about docvecs... =(
It's not clear to me what the most useful behavior would be in this case. Do you want to influence a Doc2Vec session with reused word vectors? In such a case, you might be able to cobble together the desired effect using a multi-step process that at some point uses the

Or is it that you saved a prior-version Doc2Vec model in _word2vec_format, so it also has doc vectors mixed with words in that format, and you want to convert it forward? Since many conventions for naming the doc-vecs are possible, that'd require some user-specific coding, I think, but still might be possible leveraging
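To make the "user-specific coding" point concrete, here is a minimal standard-library sketch of splitting a word2vec-style text file that mixes word and doc vectors. It assumes the doc vectors were saved under a `DOC_` key prefix; that prefix and the helper names `read_word2vec_text` and `split_by_prefix` are hypothetical and not part of gensim. It does not rebuild a Doc2Vec model, it only separates the two kinds of vectors.

```python
import io


def read_word2vec_text(stream):
    """Parse word2vec.c text format: a 'count dim' header line,
    then one 'key v1 ... vN' line per vector."""
    count, dim = (int(x) for x in stream.readline().split())
    vectors = {}
    for _ in range(count):
        parts = stream.readline().split()
        key, values = parts[0], [float(v) for v in parts[1:]]
        if len(values) != dim:
            raise ValueError(f"expected {dim} values for {key!r}, got {len(values)}")
        vectors[key] = values
    return vectors


def split_by_prefix(vectors, doc_prefix="DOC_"):
    """Separate doc vectors from word vectors using an ASSUMED naming convention."""
    words, docs = {}, {}
    for key, vec in vectors.items():
        if key.startswith(doc_prefix):
            docs[key[len(doc_prefix):]] = vec  # strip the prefix to recover the tag
        else:
            words[key] = vec
    return words, docs


# Tiny in-memory file: two word vectors and one doc vector, dimension 2.
sample = io.StringIO("3 2\nhello 0.1 0.2\nworld 0.3 0.4\nDOC_0 0.5 0.6\n")
words, docs = split_by_prefix(read_word2vec_text(sample))
```

If the old file used a different convention (for example `SENT_` tags), only the `doc_prefix` argument would need to change.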
I don't get your answer. It's a little unclear to me. My use case is simple. And now I want to load this binary format. How can I do that?
I recommend you just use the plain (gensim-native) (The word2vec.c format was only meant for string-keyed vectors – which the docvecs won't be if you're being maximally memory efficient. And it never saved all the model information. So I think you'd only want to use it if needing to maintain compatibility with other code.)
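The string-key limitation mentioned above can be illustrated with a small standard-library sketch. It assumes the doc vectors are tagged with plain integers, and the writer `write_word2vec_text` is a hypothetical helper, not a gensim function. Writing to the text format forces every key through a string, so the integer tags do not survive a round trip.

```python
import io


def write_word2vec_text(vectors, stream):
    """Write vectors in word2vec.c text format; every key is written as text."""
    items = list(vectors.items())
    dim = len(items[0][1]) if items else 0
    stream.write(f"{len(items)} {dim}\n")
    for key, vec in items:
        stream.write(f"{key} " + " ".join(repr(v) for v in vec) + "\n")


# ASSUMPTION: doc vectors keyed by integer tags.
doc_vectors = {0: [0.5, 0.6], 1: [0.7, 0.8]}

buf = io.StringIO()
write_word2vec_text(doc_vectors, buf)

# Reading the keys back yields strings, not the original integers.
keys_read_back = [line.split()[0] for line in buf.getvalue().splitlines()[1:]]
```

Nothing else about the model (training state, parameters) is present in the file either, which is consistent with the advice to keep this format for interoperability only.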
Thank you
Currently Doc2Vec.load_word2vec_format returns a Word2Vec object; shouldn't it be a Doc2Vec object?