diff --git a/docs/src/about.rst b/docs/src/about.rst index 64a65bd333..25194b9404 100644 --- a/docs/src/about.rst +++ b/docs/src/about.rst @@ -2,12 +2,12 @@ .. _about: -============ +===== About -============ +===== History --------- +------- Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library `dml.cz `_ in 2008, where it served to generate a short list of the most similar articles to a given article (**gensim = "generate similar"**). @@ -15,19 +15,18 @@ I also wanted to try these fancy "Latent Semantic Methods", but the libraries th realized the necessary computation were `not much fun to work with `_. Naturally, I set out to reinvent the wheel. Our `2010 LREC publication `_ -describes the initial design decisions behind gensim (clarity, efficiency and scalability) -and is fairly representative of how gensim works even today. +describes the initial design decisions behind Gensim: clarity, efficiency and scalability. It is fairly representative of how Gensim works even today. Later versions of gensim improved this efficiency and scalability tremendously. In fact, I made algorithmic scalability of distributional semantics the topic of my `PhD thesis `_. -By now, gensim is---to my knowledge---the most robust, efficient and hassle-free piece +By now, Gensim is---to my knowledge---the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run "hello world". In 2011, I started using `Github `_ for source code hosting -and the gensim website moved to its present domain. In 2013, gensim got its current logo and website design. +and the Gensim website moved to its present domain. In 2013, Gensim got its current logo and website design. 
Licensing @@ -35,39 +34,40 @@ Licensing Gensim is licensed under the OSI-approved `GNU LGPLv2.1 license `_. This means that it's free for both personal and commercial use, but if you make any -modification to gensim that you distribute to other people, you have to disclose +modification to Gensim that you distribute to other people, you have to disclose the source code of these modifications. -Apart from that, you are free to redistribute gensim in any way you like, though you're +Apart from that, you are free to redistribute Gensim in any way you like, though you're not allowed to modify its license (doh!). -My intent here is, of course, to **get more help and community involvement** with the development of gensim. +My intent here is to **get more help and community involvement** with the development of Gensim. The legalese is therefore less important to me than your input and contributions. -Contact me if LGPL doesn't fit your bill but you'd still like to use gensim -- we'll work something out. + +`Contact me `_ if LGPL doesn't fit your bill and you'd like the open source restrictions lifted. .. seealso:: - I also host a document similarity package `gensim.simserver`. This is a high-level - interface to `gensim` functionality, and offers transactional remote (web-based) - document similarity queries and indexing. It uses gensim to do the heavy lifting: - you don't need the `simserver` to use gensim, but you do need gensim to use the `simserver`. - Note that unlike gensim, `gensim.simserver` is licensed under `Affero GPL `_, - which is much more restrictive for inclusion in commercial projects. + We also built a high performance commercial server for NLP, document analysis, indexing, search and clustering: https://scaletext.ai. ScaleText is available both on-prem and as SaaS. + + Reach out at info@scaletext.com if you need an industry-grade NLP tool with professional support. 
+ Contributors --------------- +------------ -Credit goes to all the people who contributed to gensim, be it in `discussions `_, +Credit goes to the many people who contributed to Gensim, be it in `discussions `_, ideas, `code contributions `_ or `bug reports `_. + It's really useful and motivating to get feedback, in any shape or form, so big thanks to you all! Some honorable mentions are included in the `CHANGELOG.txt `_. Academic citing ----------------- +--------------- -Gensim has been used in `many students' final theses as well as research papers `_. When citing gensim, -please use `this BibTeX entry `_:: +Gensim has been used in `over a thousand research papers and student theses `_. + +When citing Gensim, please use `this BibTeX entry `_:: @inproceedings{rehurek_lrec, title = {{Software Framework for Topic Modelling with Large Corpora}}, @@ -83,5 +83,3 @@ please use `this BibTeX entry `_:: note={\url{http://is.muni.cz/publication/884893/en}}, language={English} } - - diff --git a/docs/src/distributed.rst b/docs/src/distributed.rst index 38f243222f..8b27fedc4f 100644 --- a/docs/src/distributed.rst +++ b/docs/src/distributed.rst @@ -1,7 +1,7 @@ .. _distributed: Distributed Computing -=================================== +===================== Why distributed computing? --------------------------- @@ -37,20 +37,20 @@ Prerequisites For communication between nodes, `gensim` uses `Pyro (PYthon Remote Objects) `_, version >= 4.27. This is a library for low-level socket communication -and remote procedure calls (RPC) in Python. 
`Pyro4` is a pure-Python library, so its installation is quite painless and only involves copying its `*.py` files somewhere onto your Python's import path:: - sudo easy_install Pyro4 + pip install Pyro4 -You don't have to install `Pyro` to run `gensim`, but if you don't, you won't be able +You don't have to install Pyro to run Gensim, but if you don't, you won't be able to access the distributed features (i.e., everything will always run in serial mode, the examples on this page don't apply). Core concepts ------------------------------------ +------------- -As always, `gensim` strives for a clear and straightforward API (see :ref:`design`). +As always, Gensim strives for a clear and straightforward API (see :ref:`design`). To this end, *you do not need to make any changes in your code at all* in order to run it over a cluster of computers! diff --git a/docs/src/gensim_theme/layout.html b/docs/src/gensim_theme/layout.html index feac733791..4b4bd9fc43 100644 --- a/docs/src/gensim_theme/layout.html +++ b/docs/src/gensim_theme/layout.html @@ -174,7 +174,7 @@

Get Expert Help From The Gensim Authors

diff --git a/docs/src/install.rst b/docs/src/install.rst index 69d101e430..61039cb1d8 100644 --- a/docs/src/install.rst +++ b/docs/src/install.rst @@ -7,120 +7,59 @@ Installation Quick install -------------- -Run in your terminal:: - - easy_install -U gensim - -or, alternatively:: +Run in your terminal (recommended):: pip install --upgrade gensim -In case that fails, make sure you're installing into a writeable location (or use `sudo`), or read on. - ------ - -Dependencies -------------- -Gensim is known to run on Linux, Windows and Mac OS X and should run on any other -platform that supports Python 2.6+ and NumPy. Gensim depends on the following software: - -* `Python `_ >= 2.6. Tested with versions 2.6, 2.7, 3.3, 3.4 and 3.5. Support for Python 2.5 was discontinued starting gensim 0.10.0; if you *must* use Python 2.5, install gensim 0.9.1. -* `NumPy `_ >= 1.3. Tested with version 1.9.0, 1.7.1, 1.7.0, 1.6.2, 1.6.1rc2, 1.5.0rc1, 1.4.0, 1.3.0, 1.3.0rc2. -* `SciPy `_ >= 0.7. Tested with version 0.14.0, 0.12.0, 0.11.0, 0.10.1, 0.9.0, 0.8.0, 0.8.0b1, 0.7.1, 0.7.0. - -**Windows users** are well advised to try the `Enthought distribution `_, -which conveniently includes Python & NumPy & SciPy in a single bundle, and is free for academic use. - - -Install Python and `easy_install` ---------------------------------- - -Check what version of Python you have with:: - - python --version - -You can download Python from http://python.org/download. - -.. note:: Gensim requires Python 2.6 / 3.3 or greater, and will not run under earlier versions. - -Next, install the `easy_install utility `_, -which will make installing other Python programs easier. - -Install SciPy & NumPy ----------------------- - -These are quite popular Python packages, so chances are there are pre-built binary -distributions available for your platform. 
You can try installing from source using easy_install:: +or, alternatively for `conda` environments:: - easy_install numpy - easy_install scipy - -If that doesn't work or if you'd rather install using a binary package, consult -http://www.scipy.org/Download. - -Install `gensim` ------------------ - -You can now install (or upgrade) `gensim` with:: - - easy_install --upgrade gensim + conda install -c conda-forge gensim That's it! Congratulations, you can proceed to the :doc:`tutorials `. ------ - -If you also want to run the algorithms over a cluster -of computers, in :doc:`distributed`, you should install with:: - - easy_install gensim[distributed] - -The optional `distributed` feature installs `Pyro (PYthon Remote Objects) `_. -If you don't know what distributed computing means, you can ignore it: -`gensim` will work fine for you anyway. -This optional extension can also be installed separately later with:: - - easy_install Pyro4 +In case that failed, make sure you're installing into a writeable location (or use `sudo`). ----- -There are also alternative routes to install: - -1. If you have downloaded and unzipped the `tar.gz source `_ - for `gensim` (or you're installing `gensim` from `github `_), - you can run:: - - python setup.py install - - to install `gensim` into your ``site-packages`` folder. -2. If you wish to make local changes to the `gensim` code (`gensim` is, after all, a - package which targets research prototyping and modifications), a preferred - way may be installing with:: +Code dependencies +----------------- - python setup.py develop +Gensim runs on Linux, Windows and Mac OS X, and should run on any other +platform that supports Python 2.7+ and NumPy. Gensim depends on the following software: - This will only place a symlink into your ``site-packages`` directory. The actual - files will stay wherever you unpacked them. -3. 
If you don't have root priviledges (or just don't want to put the package into - your ``site-packages``), simply unpack the source package somewhere and that's it! No - compilation or installation needed. Just don't forget to set your PYTHONPATH - (or modify ``sys.path``), so that Python can find the unpacked package when importing. +* `Python `_ >= 2.7 (tested with versions 2.7, 3.5 and 3.6) +* `NumPy `_ >= 1.11.3 +* `SciPy `_ >= 0.18.1 +* `Six `_ >= 1.5.0 +* `smart_open `_ >= 1.2.1 +Testing Gensim +-------------- -Testing `gensim` ----------------- +Gensim uses continuous integration, automatically running a full test suite on each pull request with -To test the package, unzip the `tar.gz source `_ and run:: ++------------+-----------------------------------------------------------------------------------------+--------------+ +| CI service | Task | Build badge | ++============+=========================================================================================+==============+ +| Travis | Run tests on Linux and check `code-style `_ | |Travis|_ | ++------------+-----------------------------------------------------------------------------------------+--------------+ +| AppVeyor | Run tests on Windows | |AppVeyor|_ | ++------------+-----------------------------------------------------------------------------------------+--------------+ +| CircleCI | Build documentation | |CircleCI|_ | ++------------+-----------------------------------------------------------------------------------------+--------------+ - python setup.py test +.. |Travis| image:: https://travis-ci.org/RaRe-Technologies/gensim.svg?branch=develop +.. _Travis: https://travis-ci.org/RaRe-Technologies/gensim -Gensim uses Travis CI for continuous integration: |Travis|_ +.. |CircleCI| image:: https://circleci.com/gh/RaRe-Technologies/gensim/tree/develop.svg?style=shield +.. _CircleCI: https://circleci.com/gh/RaRe-Technologies/gensim -.. 
|Travis| image:: https://api.travis-ci.org/piskvorky/gensim.png?branch=develop -.. _Travis: https://travis-ci.org/piskvorky/gensim +.. |AppVeyor| image:: https://ci.appveyor.com/api/projects/status/r2au32ucpn8gr0tl/branch/develop?svg=true +.. _AppVeyor: https://ci.appveyor.com/api/projects/status/r2au32ucpn8gr0tl/branch/develop?svg=true Problems? --------- -Use the `gensim discussion group `_ for -questions and troubleshooting. See the :doc:`support page `. +Use the `Gensim discussion group `_ for +questions and troubleshooting. See the :doc:`support page ` for commercial support. diff --git a/docs/src/intro.rst b/docs/src/intro.rst index 3ffc724267..bcb60efa27 100644 --- a/docs/src/intro.rst +++ b/docs/src/intro.rst @@ -7,16 +7,15 @@ Introduction Gensim is a :ref:`free ` Python library designed to automatically extract semantic topics from documents, as efficiently (computer-wise) and painlessly (human-wise) as possible. - Gensim is designed to process raw, unstructured digital texts ("*plain text*"). -The algorithms in `gensim`, such as **Latent Semantic Analysis**, **Latent Dirichlet Allocation** and **Random Projections** -discover semantic structure of documents by examining statistical -co-occurrence patterns of the words within a corpus of training documents. -These algorithms are unsupervised, which means no human input is necessary -- you only need a corpus of plain text documents. -Once these statistical patterns are found, any plain text documents can be succinctly -expressed in the new, semantic representation and queried for topical similarity -against other documents. 
+The algorithms in Gensim, such as :class:`~gensim.models.word2vec.Word2Vec`, :class:`~gensim.models.fasttext.FastText`, +Latent Semantic Analysis (LSI, LSA, see :class:`~gensim.models.lsimodel.LsiModel`), Latent Dirichlet +Allocation (LDA, see :class:`~gensim.models.ldamodel.LdaModel`) etc, automatically discover the semantic structure of documents by examining statistical +co-occurrence patterns within a corpus of training documents. These algorithms are **unsupervised**, +which means no human input is necessary -- you only need a corpus of plain text documents. + +Once these statistical patterns are found, any plain text documents (sentence, phrase, word…) can be succinctly expressed in the new, semantic representation and queried for topical similarity against other documents (words, phrases…). .. note:: If the previous paragraphs left you confused, you can read more about the `Vector @@ -27,69 +26,71 @@ against other documents. .. _design: Features ------------------- +-------- * **Memory independence** -- there is no need for the whole training corpus to reside fully in RAM at any one time (can process large, web-scale corpora). +* **Memory sharing** -- trained models can be persisted to disk and loaded back via mmap. Multiple processes can share the same data, cutting down RAM footprint. * Efficient implementations for several popular vector space algorithms, - including **Tf-Idf**, distributed incremental **Latent Semantic Analysis**, - distributed incremental **Latent Dirichlet Allocation (LDA)** or **Random Projection**; adding new ones is easy (really!). -* I/O wrappers and converters around **several popular data formats**. -* **Similarity queries** for documents in their semantic representation. - -The creation of `gensim` was motivated by a perceived lack of available, scalable software -frameworks that realize topic modelling, and/or their overwhelming internal complexity (hail Java!). 
-You can read more about the motivation in our `LREC 2010 workshop paper `_. -If you want to cite `gensim` in your own work, please refer to that article (`BibTeX `_). + including :class:`~gensim.models.word2vec.Word2Vec`, :class:`~gensim.models.doc2vec.Doc2Vec`, :class:`~gensim.models.fasttext.FastText`, + TF-IDF, Latent Semantic Analysis (LSI, LSA, see :class:`~gensim.models.lsimodel.LsiModel`), + Latent Dirichlet Allocation (LDA, see :class:`~gensim.models.ldamodel.LdaModel`) or Random Projection (see :class:`~gensim.models.rpmodel.RpModel`). +* I/O wrappers and readers from several popular data formats. +* Fast similarity queries for documents in their semantic representation. -You're welcome to share your results and experiments on the `mailing list `_. +The **principal design objectives** behind Gensim are: -The **principal design objectives** behind `gensim` are: - -1. Straightforward interfaces and low API learning curve for developers. Good - for prototyping. +1. Straightforward interfaces and low API learning curve for developers. Good for prototyping. 2. Memory independence with respect to the size of the input corpus; all intermediate steps and algorithms operate in a streaming fashion, accessing one document at a time. .. seealso:: - If you're interested in document indexing/similarity retrieval, I also maintain a higher-level package - of `document similarity server `_. It uses `gensim` internally. + We also built a high performance commercial server for NLP, document analysis, indexing, search and clustering: https://scaletext.ai. ScaleText is available both on-prem and as SaaS. + + Reach out at info@scaletext.com if you need an industry-grade NLP tool with professional support. + .. 
_availability: Availability ------------ -Gensim is licensed under the OSI-approved `GNU LGPLv2.1 license `_ -and can be downloaded either from its `github repository `_ +Gensim is licensed under the OSI-approved `GNU LGPLv2.1 license `_ and can be downloaded either from its `Github repository `_ or from the `Python Package Index `_. .. seealso:: - See the :doc:`install ` page for more info on `gensim` deployment. + See the :doc:`install ` page for more info on Gensim deployment. Core concepts ------------- -The whole `gensim` package revolves around the concepts of :term:`corpus`, :term:`vector` and -:term:`model`. - .. glossary:: Corpus - A collection of digital documents. This collection is used to automatically - infer the structure of the documents, their topics, etc. For - this reason, the collection is also called a *training corpus*. This inferred - latent structure can be later used to assign topics to new documents, which did - not appear in the training corpus. - No human intervention (such as tagging the documents by hand, or creating - other metadata) is required. - - Vector - In the Vector Space Model (VSM), each document is represented by an + A collection of digital documents. Corpora serve two roles in Gensim: + + 1. Input for model training + The corpus is used to automatically train a machine learning model, such as + :class:`~gensim.models.lsimodel.LsiModel` or :class:`~gensim.models.ldamodel.LdaModel`. + + The models use this *training corpus* to look for common themes and topics, initializing + their internal model parameters. + + Gensim is unique in its focus on *unsupervised* models so that no human intervention, + such as costly annotations or tagging documents by hand, is required. + + 2. Documents to organize. + After training, a topic model can be used to extract topics from new documents (documents + not seen in the training corpus). + + Such corpora can be :doc:`indexed `, queried by semantic similarity, clustered etc. 
+ + Vector space model + In a Vector Space Model (VSM), each document is represented by an array of features. For example, a single feature may be thought of as a question-answer pair: @@ -100,10 +101,13 @@ The whole `gensim` package revolves around the concepts of :term:`corpus`, :term The question is usually represented only by its integer id (such as `1`, `2` and `3` here), so that the representation of this document becomes a series of pairs like ``(1, 0.0), (2, 2.0), (3, 5.0)``. + If we know all the questions in advance, we may leave them implicit and simply write ``(0.0, 2.0, 5.0)``. - This sequence of answers can be thought of as a *vector* (in this case a 3-dimensional vector). For practical purposes, only questions to which the answer is (or - can be converted to) a single real number are allowed. + + This sequence of answers can be thought of as a **vector** (in this case a 3-dimensional dense vector). + For practical purposes, only questions to which the answer is (or + can be converted to) a *single floating point number* are allowed in Gensim. The questions are the same for each document, so that looking at two vectors (representing two documents), we will hopefully be able to make @@ -111,36 +115,46 @@ The whole `gensim` package revolves around the concepts of :term:`corpus`, :term therefore the original documents must be similar, too". Of course, whether such conclusions correspond to reality depends on how well we picked our questions. - Sparse Vector - Typically, the answer to most questions will be ``0.0``. To save space, - we omit them from the document's representation, and write only ``(2, 2.0), - (3, 5.0)`` (note the missing ``(1, 0.0)``). - Since the set of all questions is known in advance, all the missing features - in a sparse representation of a document can be unambiguously resolved to zero, ``0.0``. + Gensim sparse vector, Bag-of-words vector + To save space, in Gensim we omit all vector elements with value 0.0. 
For example, instead of the + 3-dimensional dense vector ``(0.0, 2.0, 5.0)``, we write only ``[(2, 2.0), (3, 5.0)]`` (note the missing ``(1, 0.0)``). Each vector element is a pair (2-tuple) of ``(feature_id, feature_value)``. The values of all missing features in this sparse representation can be unambiguously resolved to zero, ``0.0``. + + Documents in Gensim are represented by such sparse vectors (sometimes called bag-of-words vectors). + + Gensim streamed corpus + Gensim does not prescribe any specific corpus format. A corpus is simply a sequence + of sparse vectors (see above). + + For example, ``[ [(2, 2.0), (3, 5.0)], [(3, 1.0)] ]`` + is a simple corpus of two documents = two sparse vectors: the first with two non-zero elements, + the second with one non-zero element. This particular corpus is represented as a plain Python ``list``. + + However, the full power of Gensim comes from the fact that a corpus doesn't have to be a ``list``, + or a ``NumPy`` array, or a ``Pandas`` dataframe, or whatever. Gensim *accepts any object that, + when iterated over, successively yields these sparse bag-of-words vectors*. + + This flexibility allows you to create your own corpus classes that stream the sparse vectors directly from disk, network, database, dataframes… The models in Gensim are implemented such that they don't require all vectors to reside in RAM at once. You can even create the sparse vectors on the fly! - Gensim does not prescribe any specific corpus format; - a corpus is anything that, when iterated over, successively yields these sparse vectors. - For example, `set((((2, 2.0), (3, 5.0)), ((0, 1.0), (3, 1.0))))` is a trivial - corpus of two documents, each with two non-zero `feature-answer` pairs. + See our `tutorial on streamed data processing in Python `_. + For a built-in example of an efficient corpus format streamed directly from disk, see + the Matrix Market format in :mod:`~gensim.corpora.mmcorpus`. 
For a minimal blueprint example on + how to create your own streamed corpora, check out the `source code of CSV corpus `_. + Model, Transformation + Gensim uses **model** to refer to the code and associated data (model parameters) + required to transform one document representation to another. - Model - We use **model** as an abstract term referring to a transformation from - one document representation to another. In `gensim` documents are - represented as vectors so a model can be thought of as a transformation - between two vector spaces. The details of this transformation are - learned from the training corpus. + In Gensim, documents are represented as vectors (see above) so a model can be thought of as a transformation + from one vector space to another. The parameters of this transformation are learned from the training corpus. + Trained models (the data parameters) can be persisted to disk and later loaded back, either to continue + training on new training documents or to transform new documents. - For example, consider a transformation that takes a raw count of word - occurrences and weights them so that common words are discounted and - rare words are promoted. The exact amount that any particular word is - weighted by is determined by the relative frequency of that word in the - training corpus. When we apply this model we transform from one vector - space (containing the raw word counts) to another (containing the - weighted counts). + Gensim implements multiple models, such as :class:`~gensim.models.word2vec.Word2Vec`, + :class:`~gensim.models.lsimodel.LsiModel`, :class:`~gensim.models.ldamodel.LdaModel`, + :class:`~gensim.models.fasttext.FastText` etc. See the :doc:`API reference ` for a full list. .. seealso:: - For some examples on how this works out in code, go to :doc:`tutorials `. + For some examples on how all this works out in code, go to :doc:`Tutorials `. 
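The sparse vector and streamed corpus entries in the glossary above can be illustrated with a small plain-Python sketch. It does not use any Gensim API: ``to_sparse`` and ``StreamedCorpus`` are hypothetical names invented for this illustration, and feature ids start at 1 only to match the question numbering used in the glossary example.

```python
from typing import Iterable, Iterator, List, Tuple

# A sparse vector is a list of (feature_id, feature_value) pairs.
SparseVector = List[Tuple[int, float]]


def to_sparse(dense: Iterable[float], first_id: int = 1) -> SparseVector:
    """Drop zero-valued elements, keeping (feature_id, feature_value) pairs.

    `first_id` is an assumption of this sketch: ids start at 1 to mirror
    the question ids 1, 2 and 3 in the glossary example.
    """
    return [(i, float(v)) for i, v in enumerate(dense, start=first_id) if v != 0.0]


class StreamedCorpus:
    """Hypothetical corpus class: yields one sparse vector per input line.

    `lines` can be any re-iterable source of whitespace-separated numbers
    (a list of strings here), so vectors are built on the fly and never
    need to be held in RAM all at once.
    """

    def __init__(self, lines: Iterable[str]) -> None:
        self.lines = lines

    def __iter__(self) -> Iterator[SparseVector]:
        for line in self.lines:
            dense = [float(x) for x in line.split()]
            yield to_sparse(dense)


# Two documents, each given as a dense 3-dimensional vector.
corpus = StreamedCorpus(["0 2 5", "0 0 1"])
for vector in corpus:
    print(vector)
```

Because ``__iter__`` starts a fresh pass on every call, the same object can be iterated more than once, which matters for training procedures that make several passes over the data. Swapping the list of strings for another re-iterable line source would leave the class itself unchanged.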
diff --git a/docs/src/models/doc2vec.rst b/docs/src/models/doc2vec.rst index f2da8bb722..b5d2e290b5 100644 --- a/docs/src/models/doc2vec.rst +++ b/docs/src/models/doc2vec.rst @@ -1,8 +1,8 @@ -:mod:`models.doc2vec` -- Deep learning with paragraph2vec -========================================================= +:mod:`models.doc2vec` -- Doc2vec paragraph embeddings +===================================================== .. automodule:: gensim.models.doc2vec - :synopsis: Deep learning with doc2vec + :synopsis: Doc2vec paragraph embeddings :members: :inherited-members: :undoc-members: diff --git a/docs/src/models/word2vec.rst b/docs/src/models/word2vec.rst index 1679429e22..62117c1a6b 100644 --- a/docs/src/models/word2vec.rst +++ b/docs/src/models/word2vec.rst @@ -1,8 +1,8 @@ -:mod:`models.word2vec` -- Deep learning with word2vec -====================================================== +:mod:`models.word2vec` -- Word2vec embeddings +============================================= .. automodule:: gensim.models.word2vec - :synopsis: Deep learning with word2vec + :synopsis: Word2vec embeddings :members: :inherited-members: :undoc-members: diff --git a/docs/src/support.rst b/docs/src/support.rst index a9f83e1380..44b213b865 100644 --- a/docs/src/support.rst +++ b/docs/src/support.rst @@ -1,18 +1,21 @@ .. _support: -============= +======= Support -============= +======= Open source support --------------------- +------------------- + +The main communication channel is the `Gensim mailing list `_. + +Additional channels are `twitter @gensim_py `_ and `Gitter piskvorky/gensim `_. -The main communication channel is the `gensim mailing list `_. This is the preferred way to **ask for help**, **report problems** and **share insights** with the community. Newbie questions are perfectly fine, just make sure you've read the :doc:`tutorials `. 
-I discourage sending private emails, because the mailing list serves as a knowledge base for all gensim users, cutting maintenance efforts needed for support. If you feel your problem is too special, data too sensitive, technical scope too demanding, **see the "business" section below**. +I discourage sending private emails, because the mailing list serves as a knowledge base for all Gensim users, cutting maintenance efforts needed for support. If you feel your problem is too special, data too sensitive, technical scope too demanding, **see the "business" section below**. -When posting on the mailing list, try to include all relevant information, such as what it is you are trying to achieve, what went wrong, relevant gensim logs, package versions etc. +When posting on the mailing list, try to include all relevant information, such as what it is you are trying to achieve, what went wrong, relevant Gensim logs, package versions etc. **FAQ** and some useful **snippets of code** are maintained on GitHub: https://github.com/piskvorky/gensim/wiki/Recipes-&-FAQ. @@ -20,14 +23,16 @@ You can also try asking on StackOverflow, using the `gensim tag `_. +We run a consulting R&D company focused on data mining and unstructured text processing, https://rare-technologies.com. + +If you need commercial support, design validation, technical training or custom system development, `get in touch `_ for a quote. -In case you need commercial support, design validation, technical training or custom system development, `get in touch `_ for a quote. Developer support ------------------ -Developers who `tweak gensim internals `_ are encouraged to report issues at the `GitHub issue tracker `_. -Note that this is not a medium for discussions or asking open-ended questions; please use the mailing list for that. +Developers who `tweak Gensim internals `_ are encouraged to report issues at the `GitHub issue tracker `_. 
+ +Note that GitHub is not a medium for discussions or asking open-ended questions; please use the `mailing list `_ for that.