
Fix HashDictionary documentation #2073

Merged
merged 4 commits into from
May 31, 2018

Conversation

piskvorky
Owner

@piskvorky piskvorky commented May 30, 2018

The documentation for HashDictionary used broken English and broken formatting, and presented some misleading information. This is confusing to users -- see for example #2049.

This PR attempts to fix the docs.


@JanmajaySingh JanmajaySingh left a comment

Looks good! Specifically the parts about initializing a HashDictionary object without passing a corpus, and about changing the definition of a document from "list of strings" to "sequence of strings". Thanks!

- * All tokens will be used (not only that you see in documents), typical problem
-   for :class:`~gensim.corpora.dictionary.Dictionary`.
+ * Able to represent all tokens (not only those present in training documents)

I think "Able to represent any token..." would be better wording?

Owner Author

Fixed in 7a38bcd.
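
For readers following the "able to represent any token" point: a minimal sketch of the hashing trick behind it, written with the standard library only. This is not gensim's code. The function name `token_to_id`, the choice of `zlib.adler32`, and the `ID_RANGE` value are assumptions made for illustration, and the library's actual defaults may differ.

```python
import zlib

# Illustrative size of the id space (an assumption, not taken from this PR).
ID_RANGE = 32000


def token_to_id(token, id_range=ID_RANGE):
    """Map any string token to an integer id in [0, id_range).

    No vocabulary lookup is involved, so a token that never appeared in
    any training document still gets a valid id.
    """
    return zlib.adler32(token.encode("utf-8")) % id_range


# A token from a training corpus and a token nobody has seen before are
# handled identically: both hash into the same fixed id space.
seen_id = token_to_id("human")
unseen_id = token_to_id("a-token-never-seen-before")
```

The trade-off is that distinct tokens can collide on the same id, which a lookup-based dictionary avoids at the cost of only knowing the tokens it was built from.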

  documents : iterable of iterable of str
-     Iterable of documents, if given - use them to initialization.
+     Iterable of documents. If given, used to collect additional corpus statistics. HashDictionary can work without these statistics (optional parameter).

This is exactly what was required. Thanks
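
To illustrate why `documents` can be optional: a hedged sketch of a hash-based dictionary in which the corpus only feeds statistics and never changes the token-to-id mapping. `SketchHashDictionary`, its `update_stats` flag, and its attributes are hypothetical stand-ins invented for this example. They are not the gensim class or its signature.

```python
import zlib
from collections import Counter


class SketchHashDictionary:
    """Hypothetical stand-in for a hash-based dictionary (not gensim's class)."""

    def __init__(self, documents=None, id_range=32000):
        self.id_range = id_range    # illustrative default
        self.num_docs = 0           # statistic: documents seen
        self.doc_freq = Counter()   # statistic: id -> number of documents containing it
        if documents is not None:
            for document in documents:
                self.doc2bow(document, update_stats=True)

    def token_id(self, token):
        # The mapping depends only on the token itself, never on the corpus.
        return zlib.adler32(token.encode("utf-8")) % self.id_range

    def doc2bow(self, document, update_stats=False):
        # `document` can be any sequence of strings (list, tuple, ...).
        counts = Counter(self.token_id(token) for token in document)
        if update_stats:
            self.num_docs += 1
            self.doc_freq.update(counts.keys())
        return sorted(counts.items())


no_corpus = SketchHashDictionary()
with_corpus = SketchHashDictionary([["human", "computer"], ["computer", "graph"]])

document = ("graph", "graph", "minors")
bow_without = no_corpus.doc2bow(document)
bow_with = with_corpus.doc2bow(document)
```

In this sketch both instances produce the same bag-of-words for the same document. The only difference is that `with_corpus` has also recorded document counts and frequencies.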

@piskvorky piskvorky merged commit 5cd21f3 into develop May 31, 2018
@piskvorky piskvorky deleted the hashdictionary_docs branch May 31, 2018 09:38
@piskvorky
Owner Author

@menshikh-iv looks like some flake8 test failed -- line too long. I don't think we care about such errors; can you disable that check (and re-run the tests)?


2 participants