node2vec uses CBOW instead of skip-gram #40
Node2vec and DeepWalk's original proposals are built upon the skip-gram model. By default, nodevectors does not set the parameter `w2vparams["sg"]` to 1, so the underlying Word2Vec model uses the default value of 0, which means CBOW is used instead of skip-gram. This has major consequences for the quality of the embeddings.
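For anyone landing here, a minimal sketch of the workaround the report implies: pass `sg: 1` through `w2vparams` so the underlying gensim Word2Vec trains with skip-gram. Only `w2vparams` itself is taken from this issue; the other keyword names (`n_components`, `walklen`) and the `fit`/`predict` calls reflect my reading of the nodevectors API and may differ across versions.

```python
# Sketch, not from the thread: force skip-gram via w2vparams.
import networkx as nx
from nodevectors import Node2Vec

G = nx.karate_club_graph()

g2v = Node2Vec(
    n_components=32,            # embedding dimension (assumed parameter name)
    walklen=20,                 # random-walk length (assumed parameter name)
    w2vparams={"sg": 1,         # sg=1 -> skip-gram; sg=0 -> CBOW (gensim default)
               "window": 10,
               "negative": 5},
)
g2v.fit(G)                      # samples walks, then trains gensim Word2Vec
print(g2v.predict(0))           # embedding for node 0
```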
Thanks, can you confirm empirically that `sg=1` is better than `sg=0`? If so, I'll change the default along with other updates this week.
It is for my graphs, but I'm not sure this is always the case. There is some discussion about which one is better, but in the context of NLP; I couldn't find anyone discussing it in the context of graph embeddings. Anyway, I suggest you not only use 1 as the default, but also force skip-gram.
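One way to run that empirical check, sketched here with gensim directly rather than through nodevectors (the walk generator and the edge-similarity proxy are my own stand-ins, not anything from this thread; gensim 4.x API assumed): train two models on identical walks, differing only in `sg`, and compare how similar the embeddings of adjacent nodes are.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()

def random_walks(graph, walks_per_node=10, walk_len=20, seed=0):
    """Plain uniform random walks; node2vec's biased walks would also work."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for node in graph.nodes():
            walk = [node]
            while len(walk) < walk_len:
                nbrs = list(graph.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            walks.append([str(n) for n in walk])
    return walks

walks = random_walks(G)

def mean_edge_similarity(sg):
    # Identical training data and hyperparameters; only the sg flag differs.
    model = Word2Vec(walks, vector_size=32, window=5, sg=sg,
                     negative=5, epochs=10, min_count=1, seed=42)
    sims = [model.wv.similarity(str(u), str(v)) for u, v in G.edges()]
    return sum(sims) / len(sims)

print("CBOW      (sg=0):", mean_edge_similarity(0))
print("skip-gram (sg=1):", mean_edge_similarity(1))
```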
Check the reference below for skip-gram vs CBOW. The quality difference between the two seems to be the fault of a longstanding implementation error in the original word2vec and Gensim implementations.

İrsoy, Ozan, Adrian Benton, and Karl Stratos. "Koan: A Corrected CBOW Implementation." arXiv:2012.15332 [cs, stat], December 30, 2020. http://arxiv.org/abs/2012.15332
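For readers who don't want to open the paper, here is a toy numpy illustration, under my reading of it, of the inconsistency it describes: the CBOW forward pass averages the context vectors, but the original word2vec/Gensim update applies the full hidden-layer gradient to each context vector instead of scaling it by 1/|context|, which is what the mean implies.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 4                              # number of context words
ctx = rng.normal(size=(C, 8))      # context word vectors
h = ctx.mean(axis=0)               # CBOW forward pass: mean of context vectors
grad_h = rng.normal(size=8)        # some upstream gradient w.r.t. h

# Original word2vec/Gensim: each context vector receives the full gradient,
# as if the forward pass had summed rather than averaged.
update_original = np.tile(grad_h, (C, 1))
# Koan's correction: d(mean)/d(ctx_i) = 1/C, so each vector gets grad_h / C.
update_corrected = np.tile(grad_h / C, (C, 1))

print("per-vector update norm, original :", np.linalg.norm(update_original[0]))
print("per-vector update norm, corrected:", np.linalg.norm(update_corrected[0]))
```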
Just passing through, noticed this issue in the course of answering someone's Node2Vec/Word2Vec interaction question, thought I'd mention: