Add word2vec.PathLineSentences for reading a directory as a corpus (#1364) #1423
Commits on Jun 16, 2017
issue piskvorky#1364 first commit, corpus from a directory
Added models.word2vec.LineSentencePath to read all of a directory's files in the same style as models.word2vec.LineSentence.
Michael Sherman committed Jun 16, 2017 (commit 44fb606)
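To illustrate what reading a directory "in the same style as LineSentence" can mean, here is a minimal sketch using only the standard library. It is not the code from this commit: the class name DirectoryLineSentences, the sorted file order, and the skipping of subdirectories are all assumptions made for the example.

```python
import os
import tempfile


class DirectoryLineSentences:
    """Hypothetical stand-in: yield one token list per line, for every file in a directory."""

    def __init__(self, source_dir):
        self.source_dir = source_dir

    def __iter__(self):
        # Sorting the file names is an assumption; it makes the order deterministic.
        for name in sorted(os.listdir(self.source_dir)):
            path = os.path.join(self.source_dir, name)
            if not os.path.isfile(path):
                continue  # this sketch ignores subdirectories
            with open(path, encoding="utf8") as handle:
                for line in handle:
                    yield line.split()  # whitespace tokenization, one sentence per line


def demo():
    """Write two small files to a temporary directory and read them back as one corpus."""
    with tempfile.TemporaryDirectory() as tmp:
        files = [("a.txt", "hello world\nfoo bar\n"), ("b.txt", "second file\n")]
        for name, text in files:
            with open(os.path.join(tmp, name), "w", encoding="utf8") as handle:
                handle.write(text)
        return list(DirectoryLineSentences(tmp))
```

Because the object defines `__iter__` rather than being a one-shot generator, it can be iterated more than once, which matters for training code that makes several passes over a corpus.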
test for word2vec.LineSentencePath issue piskvorky#1364
Initial attempt at a test, including test files. The test splits the lee_background.cor file into two parts, puts them in a directory, and checks that they match the unsplit file as loaded by word2vec.LineSentence.
Michael Sherman committed Jun 16, 2017 (commit 0a62352)
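The idea behind this test (split one file in two, then check that the directory reading equals the single-file reading) can be sketched as follows. The helper names read_sentences, read_directory, and split_matches_whole are hypothetical, and the sketch uses a throwaway temporary directory instead of the lee_background.cor test data.

```python
import os
import tempfile


def read_sentences(path):
    """One whitespace-tokenized sentence per line of a single file."""
    with open(path, encoding="utf8") as handle:
        return [line.split() for line in handle]


def read_directory(dir_path):
    """Concatenate the sentences of every file in dir_path, in sorted name order."""
    sentences = []
    for name in sorted(os.listdir(dir_path)):
        sentences.extend(read_sentences(os.path.join(dir_path, name)))
    return sentences


def split_matches_whole(lines):
    """Write `lines` to one file and to two half-files, then compare both readings."""
    with tempfile.TemporaryDirectory() as tmp:
        whole = os.path.join(tmp, "whole.txt")
        parts_dir = os.path.join(tmp, "parts")
        os.mkdir(parts_dir)
        with open(whole, "w", encoding="utf8") as handle:
            handle.writelines(lines)
        middle = len(lines) // 2
        # The part names sort in the same order as the original line order.
        for name, chunk in [("1.txt", lines[:middle]), ("2.txt", lines[middle:])]:
            with open(os.path.join(parts_dir, name), "w", encoding="utf8") as handle:
                handle.writelines(chunk)
        return read_directory(parts_dir) == read_sentences(whole)
```

The comparison only holds if the parts are read back in the order they were split, which is why the sketch sorts file names.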
better handling of input for LineSentencePath
No longer sensitive to an input path without a trailing OS-specific slash.
Michael Sherman committed Jun 16, 2017 (commit b55a844)
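The commit message does not say how the trailing-slash sensitivity was removed. One common way to get this behaviour in Python, shown here only as an assumption, is to build paths with os.path.join instead of string concatenation. The function name file_in_dir is hypothetical.

```python
import os


def file_in_dir(dir_path, file_name):
    """Build the path of file_name inside dir_path, with or without a trailing separator."""
    # os.path.join inserts a separator only when dir_path does not already end with one,
    # so "corpus" and "corpus" + os.sep produce the same result.
    return os.path.join(dir_path, file_name)


def naive_file_in_dir(dir_path, file_name):
    """Plain concatenation, which breaks when the trailing separator is missing."""
    return dir_path + file_name
```

With plain concatenation, "corpus" and "a.txt" become "corpusa.txt", which is the kind of input sensitivity the commit describes.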
Merge branch 'LineSentencePath' into develop
Michael Sherman committed Jun 16, 2017 (commit bde9cfd)
Commits on Jun 19, 2017
Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim into develop
Michael Sherman committed Jun 19, 2017 (commit 86517a8)
LineSentencePath renamed PathLineSentences
in word2vec.py. Test updated as well.
Michael Sherman committed Jun 19, 2017 (commit aef2879)
LineSentencePath rename to PathLineSentences
in models.word2vec. Tests also updated.
Michael Sherman committed Jun 19, 2017 (commit 6a21b80)
had only 1 space before an inline comment, flagged by travis CI build
Michael Sherman committed Jun 19, 2017 (commit f362e33)
updated PathLineSentences test and test data
Removed the LineSentencePath directory and created PathLineSentences. The lee corpus duplicates in LineSentencePath were wasting space, so a new small corpus was made to test PathLineSentences and put in the directory. Changed the test to read both files manually, combine them, and compare the result to PathLineSentences, rather than having a separate single file to match the entire contents of the PathLineSentences test_data directory.
Michael Sherman committed Jun 19, 2017 (commit 1dbe7b6)
word2vec.PathLineSentences single file support
Changed PathLineSentences to support a single file in addition to a directory. It raises a warning to use LineSentence when a single file is given as a parameter. Added a corresponding test.
Michael Sherman committed Jun 19, 2017 (commit ac49054)
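The file-or-directory behaviour described in this commit can be sketched as below. This is an illustration under assumptions, not the committed code: the function name list_input_files is hypothetical, the warning is emitted through the standard warnings module (the commit does not say which mechanism is used), and the ValueError for other inputs is an addition of the sketch.

```python
import os
import tempfile
import warnings


def list_input_files(source):
    """Return the files to read for `source`, which may be a single file or a directory."""
    if os.path.isfile(source):
        # Assumption: warn through the `warnings` module; a logger would work as well.
        warnings.warn("single file given as input; a single-file reader would suffice")
        return [source]
    if os.path.isdir(source):
        names = sorted(os.listdir(source))
        return [os.path.join(source, name) for name in names]
    raise ValueError("input is neither a file nor a directory: %r" % (source,))


def demo():
    """Call list_input_files on a file and on its parent directory; count the warnings."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "only.txt")
        with open(path, "w", encoding="utf8") as handle:
            handle.write("a b\n")
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            from_file = list_input_files(path)
        from_dir = list_input_files(tmp)
        return from_file == [path], from_dir == [path], len(caught)
```

Normalizing both cases to a list of file paths keeps the reading loop identical for a file and a directory; only the warning differs.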
Michael Sherman committed Jun 19, 2017 (commit bda1fe7)
Michael Sherman committed Jun 19, 2017 (commit 83eb848)
Commits on Jun 21, 2017
Commit dfd1f8e
Commits on Jun 23, 2017
Merge branch 'develop' into LineSentencePath
resolved test_word2vec.py manually
Michael Sherman committed Jun 23, 2017 (commit 4125143)
Merge branch 'master' of https://github.com/RaRe-Technologies/gensim into develop
Michael Sherman committed Jun 23, 2017 (commit 14c2265)
Merge branch 'develop' of https://github.com/RaRe-Technologies/gensim into develop
Michael Sherman committed Jun 23, 2017 (commit 45b92f2)