Bug in Phrases.export_phrases() #794
Comments
Good find! I believe the intent of So perhaps best fix is to add
Hi everyone!
As reported on the discussion list (https://groups.google.com/forum/#!topic/gensim/N0nMD95N6Iw), the fix as applied wasn't really effective. I think my suggestion is still valid: the line just needs to be at the end of the loop, outside any conditionals. (Specifically, de-indented two levels.) The test-for-two-returned-phrases that was added didn't really probe the behavior of Reopening. Best approach will be to make a valid failing test first, then try the de-indentation as a fix.
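To illustrate the kind of failure described above, here is a minimal sketch. It is not gensim's actual code: the function names, the `scores` lookup, the `last_bigram` flag, and the sample sentence are all assumptions made up for this example. It only shows how a flag reset that sits inside a conditional can get stuck after the first hit, and how moving the reset to the end of the loop body lets later bigrams in the same sentence through.

```python
def export_buggy(sentence, scores, threshold):
    """Toy exporter (hypothetical): the flag reset is nested inside the conditional."""
    found = []
    last_bigram = False  # assumed flag: True right after a bigram was emitted
    for word_a, word_b in zip(sentence, sentence[1:]):
        pair = (word_a, word_b)
        if pair in scores and not last_bigram:
            if scores[pair] > threshold:
                found.append(pair)
                last_bigram = True
                continue
            # Only reachable while the flag is already False, so it never clears it.
            last_bigram = False
    return found


def export_fixed(sentence, scores, threshold):
    """Same toy exporter with the reset at the end of the loop, outside any conditionals."""
    found = []
    last_bigram = False
    for word_a, word_b in zip(sentence, sentence[1:]):
        pair = (word_a, word_b)
        if pair in scores and not last_bigram:
            if scores[pair] > threshold:
                found.append(pair)
                last_bigram = True
                continue  # skip the reset so the overlapping next pair is not emitted
        last_bigram = False
    return found


# Made-up data for the sketch: one sentence containing two non-overlapping bigrams.
sentence = ["new", "york", "is", "near", "new", "jersey"]
scores = {("new", "york"): 1.0, ("new", "jersey"): 1.0}

print(export_buggy(sentence, scores, 0.5))
print(export_fixed(sentence, scores, 0.5))
```

A failing test in this spirit would feed a sentence with two non-overlapping bigrams and assert that both come back; the buggy variant returns only the first.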
Fixed in #1362
A small bug in
bigram.export_phrases(sentences)
causes it to return a maximum of one bigram per sentence. For example:
Returns:
The last sentence has two bigrams, but only the first is returned. This happens because this line
https://github.com/RaRe-Technologies/gensim/blob/master/gensim/models/phrases.py#L216
prevents further iteration within a sentence.
If it's commented out, it correctly returns: