LsiModel.docs_processed attribute #763
@@ -395,6 +396,7 @@ def add_documents(self, corpus, chunksize=None, decay=None):
             if self.dispatcher:
                 logger.info("reached the end of input; now waiting for all remaining jobs to finish")
                 self.projection = self.dispatcher.getstate()
+            self.docs_processed += len(corpus) if hasattr(corpus, '__len__') else doc_no
can we always use doc_no?
Yea, I guess so, for this line.
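To illustrate why a running counter is the safer choice here: a streamed corpus may be iterable without defining `__len__`, while a counter incremented as chunks are consumed works for both streamed and in-memory inputs. The following is only a minimal sketch of that idea, not gensim's code; `StreamedCorpus` and `count_documents` are hypothetical names made up for this example.

```python
from itertools import islice


class StreamedCorpus:
    """Hypothetical corpus that can be iterated but defines no __len__."""

    def __init__(self, n_docs):
        self.n_docs = n_docs

    def __iter__(self):
        for i in range(self.n_docs):
            # Each document is a bag-of-words list of (token_id, count) pairs.
            yield [(i % 5, 1)]


def count_documents(corpus, chunksize=4):
    """Count documents while consuming the corpus chunk by chunk."""
    doc_no = 0
    iterator = iter(corpus)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            break
        # The counter never needs len(corpus), only the size of each chunk.
        doc_no += len(chunk)
    return doc_no


streamed = StreamedCorpus(10)         # no __len__ available
in_memory = list(StreamedCorpus(7))   # a plain list, so len() works too
```

With this sketch, `count_documents(streamed)` and `count_documents(in_memory)` both return the number of documents seen, so the `hasattr(corpus, '__len__')` branch is not needed for the count itself.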
@@ -16,14 +16,17 @@

 import numpy

-from gensim.utils import to_unicode, smart_extension
+from gensim.utils import to_unicode  # , smart_extension
Remove import if no longer needed.
Done.
Side note: There are many more unused imports throughout gensim. They can be dangerous to remove, though, for someone like me who is unfamiliar with the internals of the packages being imported. For example, import seaborn has side effects, and obviously from __future__ import division does too.
Nice! These types of fixes are very valuable. What was your motivation for this PR, @hobson?
Seems less explicit, less pythonic to me, but happy to do it if you like. I was debugging my training on an iterable QuerySetCorpus class for a
On Wed, Jun 29, 2016, 8:55 PM Radim Řehůřek notifications@github.com
Agree.
Just a line in the CHANGELOG and then will merge.
unittests in test_lsimodel.py and test_corpora.py
works for test MmCorpus in unittests
also works for large custom corpora in production app