Add verification when summarize_corpus returns null. Fix #1531. #1570
gensim/summarization/summarizer.py
Outdated
    # If couldn't get important docs, the algorithm ends.
    if not most_important_docs:
        logger.warning("Couldn't get relevant sentences.")
        return
Would raising an exception be better? I'm not sure whether this is an error state or just a warning.
Many people don't have logging enabled, and the docstring implies the result of this function is a string (not None).
I don't think this is an error. It might make even more sense to return the entire text, since a summary wasn't possible, but that would break compatibility with the old behavior.
Regarding the docstring, the method returns a string, or a list if the split parameter is set to True, so perhaps the best thing to do is:
    if not most_important_docs:
        logger.warning("Couldn't get relevant sentences.")
        return [] if split else ""
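To make the proposed contract concrete, here is a minimal, self-contained sketch. It does not use gensim's actual code: `summarize_sketch` and `select_important_sentences` are hypothetical stand-ins, and the word-count filter is only an assumed placeholder for the real ranking step. It only illustrates that the empty result keeps the same type as a normal result.

```python
import logging

logger = logging.getLogger(__name__)


def select_important_sentences(sentences, min_words=3):
    """Hypothetical stand-in for the ranking step: keep sentences with enough words."""
    return [s for s in sentences if len(s.split()) >= min_words]


def summarize_sketch(text, split=False):
    """Return the summary as a string, or as a list of sentences when split is True."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    most_important_docs = select_important_sentences(sentences)

    # Border case: nothing relevant was found, so return an empty value
    # whose type matches what the caller asked for.
    if not most_important_docs:
        logger.warning("Couldn't get relevant sentences.")
        return [] if split else ""

    return most_important_docs if split else "\n".join(most_important_docs)
```

With this shape, callers can iterate over or concatenate the result without first checking for None.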
That looks like a good solution 👍
Except we probably want to work with unicode in general ("" => u"", if the rest of the code uses proper unicode too).
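As a small aside on the `u""` suggestion, the sketch below assumes Python 3.3 or later, where the `u` prefix is accepted and both literals produce the same `str` type. On Python 2 the two literals have different types, which is where the distinction matters.

```python
# On Python 3.3+, the u prefix is allowed and changes nothing:
# both literals are instances of the built-in str type.
empty_plain = ""
empty_unicode = u""

same_type = type(empty_plain) is type(empty_unicode)
same_value = empty_plain == empty_unicode
```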
CC @menshikh-iv
- Returns empty list on border case of summarize_corpus.
- Returns empty string or empty list on border case of summarize.
- Fixed test accordingly.
- Removed some test code repetition.
Thank you @fbarrios