-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Added 'score' function for LdaModel's sklearn wrapper #1445
[MRG] Added 'score' function for LdaModel's sklearn wrapper #1445
Conversation
""" | ||
Sklearn wrapper for LDA model. derived class for gensim.model.LdaModel . | ||
Sklearn wrapper for LDA model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you remove the original class name (gensim.model.LdaModel)? The user will find it more difficult to find the documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And description for scorer
parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- add perplexity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@menshikh-iv I removed the comment "derived class for gensim.model.LdaModel" because the class SklLdaModel
is no longer a derived class of LdaModel
class (we are now using composite design pattern).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I said inaccurate. I mean that it's a good idea to store "full classpath" in docstring (gensim.model.LdaModel)
|
||
def score(self, X, y=None): | ||
""" | ||
Compute score reflecting how well the model has fit for the input data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Describe the change of perplexity sign in docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@dsquareindia @macks22 I have been trying to check the scoring options available in Since both of you have worked with topic coherence in the past, I thought you might be able to help me resolve this issue. :) |
@chinmayapancholi13 The naming conventions used throughout the coherence code generally reflect the namings used in the original paper Exploring the Space of Topic Coherence Measures. The solution to this is to make sure that all words in your topn topic lists are present in the corpus you are using to calculate coherence. However, it would also be useful to add some sort of error handling for zero division and include a useful message that explains the issue and re-raises. |
Thank you @chinmayapancholi13 |
This PR adds
score
function for scikit-learn wrappers which would also be inherently used by functions likeGridSearchCV
.