Update non-API docs #2101

Merged
merged 14 commits into from
Jun 27, 2018

piskvorky
Owner

Updating ancillary docs like About, Intro, Install etc.

@piskvorky added the "documentation" label (current issue related to documentation) on Jun 22, 2018

or, alternatively::

pip install --upgrade gensim
easy_install -U gensim
Contributor

easy_install is no longer relevant (everyone uses pip or conda now); better to replace easy_install with conda install -c conda-forge gensim.
In other places, please always use pip.

Owner Author

OK.

Once these statistical patterns are found, any plain text documents can be succinctly
expressed in the new, semantic representation and queried for topical similarity
against other documents.
The algorithms in Gensim, such as **Word2Vec**, **FastText**, **Latent Semantic Analysis**, **Latent Dirichlet Allocation** and **Random Projections**, discover semantic structure of documents by examining statistical co-occurrence patterns within a corpus of training documents. These algorithms are **unsupervised**, which means no human input is necessary -- you only need a corpus of plain text documents.
Contributor

Please add links to the documentation pages for each model. By the way, who uses RandomProjections? It may be better to mention something like TfIdfModel or Doc2Vec instead (but not RandomProjections).

Owner Author

@piskvorky piskvorky Jun 25, 2018

OK. (but random projections are pretty cool, just not as hip at the moment)

@@ -28,9 +28,6 @@ platform that supports Python 2.6+ and NumPy. Gensim depends on the following so
* `NumPy <http://www.numpy.org>`_ >= 1.3. Tested with version 1.9.0, 1.7.1, 1.7.0, 1.6.2, 1.6.1rc2, 1.5.0rc1, 1.4.0, 1.3.0, 1.3.0rc2.
* `SciPy <http://www.scipy.org>`_ >= 0.7. Tested with version 0.14.0, 0.12.0, 0.11.0, 0.10.1, 0.9.0, 0.8.0, 0.8.0b1, 0.7.1, 0.7.0.

**Windows users** are well advised to try the `Enthought distribution <http://www.enthought.com/products/epd.php>`_,
which conveniently includes Python & NumPy & SciPy in a single bundle, and is free for academic use.


Install Python and `easy_install`
Contributor

Please remove this section


-----

There are also alternative routes to install:

1. If you have downloaded and unzipped the `tar.gz source <http://pypi.python.org/pypi/gensim>`_
for `gensim` (or you're installing `gensim` from `github <https://github.com/piskvorky/gensim/>`_),
for Gensim (or you're installing Gensim from `Github <https://github.com/piskvorky/gensim/>`_),
you can run::

python setup.py install
Contributor

pip install .

Testing `gensim`
----------------
Testing Gensim
--------------

To test the package, unzip the `tar.gz source <http://pypi.python.org/pypi/gensim>`_ and run::

python setup.py test
Contributor

tox -e {py27,py35,py36}-{win,linux}

Owner Author

What does this mean?

Contributor

tox -e <PYTHON_VERSION>-<OS_VERSION>. For example, to run the tests on Linux with Python 3.6, I would run tox -e py36-linux
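
As a rough illustration of the naming scheme in the suggestion above, the brace pattern `{py27,py35,py36}-{win,linux}` can be read as "pick one Python version and one OS". The sketch below only enumerates the resulting environment names; the helper `expand_tox_envs` is hypothetical and is not part of tox or Gensim.

```python
from itertools import product


def expand_tox_envs(pythons, platforms):
    """Return every '<python>-<platform>' name covered by the brace pattern.

    Hypothetical helper for illustration only; tox does its own parsing.
    """
    return [f"{py}-{osname}" for py, osname in product(pythons, platforms)]


# The two groups taken from the pattern {py27,py35,py36}-{win,linux}.
envs = expand_tox_envs(["py27", "py35", "py36"], ["win", "linux"])
for env in envs:
    # Each name would be passed on its own, e.g. `tox -e py36-linux`.
    print(env)
```

Under this reading, a Linux run on Python 3.6 corresponds to the single name `py36-linux`.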


To test the package, unzip the `tar.gz source <http://pypi.python.org/pypi/gensim>`_ and run::

python setup.py test

Gensim uses Travis CI for continuous integration: |Travis|_
Gensim uses Travis CI for continuous integration, automatically running the full test suite on each pull request and commit: |Travis|_
Contributor

In addition

.. |Travis| image:: https://api.travis-ci.org/piskvorky/gensim.png?branch=develop
.. _Travis: https://travis-ci.org/piskvorky/gensim
.. |Travis| image:: https://travis-ci.org/RaRe-Technologies/gensim.svg?branch=develop
.. _Travis: https://travis-ci.org/RaRe-Technologies/gensim
Contributor

We also need to add these (after converting them from Markdown first, of course):

[![Conda-forge Build](https://anaconda.org/conda-forge/gensim/badges/version.svg)](https://anaconda.org/conda-forge/gensim)
[![Wheel](https://img.shields.io/pypi/wheel/gensim.svg)](https://pypi.python.org/pypi/gensim)

Owner Author

I don't think that's really needed or relevant here.

@menshikh-iv
Contributor

@piskvorky I added fixes for the current files.
Please make the last changes to intro.rst:

  • merge vector/sparse vector (I see no reason to have both)
  • add BoW vector
  • add more information about our corpus format (some object with __iter__ that yields BoWs).
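
To make the last point concrete, here is a minimal sketch of the corpus format described above: an object whose `__iter__` yields one bag-of-words (BoW) vector per document, each vector being a sparse list of `(token_id, count)` pairs. The class name `StreamedCorpus` and its whitespace tokenization are assumptions made for illustration; the sketch uses only the standard library and is not Gensim code.

```python
from collections import Counter


class StreamedCorpus:
    """Hypothetical corpus: iterating yields one BoW vector per document."""

    def __init__(self, texts):
        self.texts = texts
        # Assign each distinct token an integer id, in order of first appearance.
        self.token2id = {}
        for text in texts:
            for token in text.lower().split():
                self.token2id.setdefault(token, len(self.token2id))

    def __iter__(self):
        for text in self.texts:
            counts = Counter(self.token2id[t] for t in text.lower().split())
            # A BoW vector: sparse list of (token_id, count) pairs, sorted by id.
            yield sorted(counts.items())


corpus = StreamedCorpus(["human computer interaction", "computer computer survey"])
bows = list(corpus)
print(bows)  # [[(0, 1), (1, 1), (2, 1)], [(1, 2), (3, 1)]]
```

Because `__iter__` builds the vectors on each pass, the corpus can be iterated more than once without holding all vectors in memory.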

@menshikh-iv menshikh-iv merged commit 43c01a2 into develop Jun 27, 2018
@menshikh-iv menshikh-iv deleted the update_docs branch June 27, 2018 05:11