Word2vec in PyTorch

This repo draws heavily on another repo.

This repo implements the skip-gram model with negative sampling from Word2vec by Mikolov et al. [1].
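
For reference, the objective that skip-gram with negative sampling maximizes for one (input word, context word) pair, as stated in [1], is shown below. Here v_{w_I} is the input vector of the center word, v'_{w_O} is the output vector of the observed context word, k is the number of negative samples, and P_n(w) is the noise distribution. How this repo names or batches these quantities in code is not shown here.

```latex
\log \sigma\left({v'_{w_O}}^{\top} v_{w_I}\right)
  + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
    \left[ \log \sigma\left(-{v'_{w_i}}^{\top} v_{w_I}\right) \right]
```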

Tricks below are also implemented:

subsampling
negative sampling from the unigram distribution raised to a power (0.75 in [1])
learning rate decay
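
The three tricks above can be sketched in plain Python as follows. This is a minimal illustration, not the code in this repo: the function names are hypothetical, the subsampling and noise-distribution formulas follow [1], and the linear schedule with a small floor is an assumption modelled on the original word2vec C tool, so this repo's actual decay rule may differ. The threshold `t=0.1` is a toy value chosen because the example corpus is tiny; [1] suggests values around 1e-5 for large corpora.

```python
import math
import random
from collections import Counter


def keep_probabilities(counts, t=1e-5):
    """Subsampling: keep word w with probability min(1, sqrt(t / f(w))),
    where f(w) is the relative frequency of w in the corpus."""
    total = sum(counts.values())
    return {w: min(1.0, math.sqrt(t / (c / total))) for w, c in counts.items()}


def negative_sampling_weights(counts, power=0.75):
    """Noise distribution: unigram counts raised to `power`, then normalised."""
    weighted = {w: c ** power for w, c in counts.items()}
    norm = sum(weighted.values())
    return {w: v / norm for w, v in weighted.items()}


def linear_lr(initial_lr, step, total_steps, min_ratio=1e-4):
    """Linearly decayed learning rate, floored at initial_lr * min_ratio."""
    progress = step / total_steps
    return initial_lr * max(min_ratio, 1.0 - progress)


# Toy corpus, only to exercise the helpers above.
tokens = "the cat sat on the mat the end".split()
counts = Counter(tokens)

keep = keep_probabilities(counts, t=0.1)
noise = negative_sampling_weights(counts)

rng = random.Random(0)
# Drop frequent words at random according to their keep probability.
subsampled = [w for w in tokens if rng.random() < keep[w]]
# Draw 5 negative samples from the smoothed unigram distribution.
negatives = rng.choices(list(noise), weights=list(noise.values()), k=5)
```

Raising counts to a power below 1 flattens the distribution, so frequent words such as "the" are drawn as negatives less often than their raw frequency would suggest, while rare words are drawn somewhat more often.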

Requirements

PyTorch >= 0.4.1
Gensim >= 3.6.0 (for testing only)

Fast run

To train the model with the default settings, run

python train.py

By default this trains the Word2vec model on a Chinese corpus. A toy English corpus, the text of the novel "Jane Eyre", is also available at data/trainset.txt.

Issues and PRs are welcome!

Reference

[1] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 2013: 3111-3119.
