From 748e729449073571a9fff779669dbfd027205403 Mon Sep 17 00:00:00 2001
From: lgmoneda
Date: Sun, 29 Jan 2017 12:26:16 -0200
Subject: [PATCH] Corpora_and_Vector_Spaces tutorial text fix

---
 docs/notebooks/Corpora_and_Vector_Spaces.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/notebooks/Corpora_and_Vector_Spaces.ipynb b/docs/notebooks/Corpora_and_Vector_Spaces.ipynb
index 85ef49506f..c6f7b6b189 100644
--- a/docs/notebooks/Corpora_and_Vector_Spaces.ipynb
+++ b/docs/notebooks/Corpora_and_Vector_Spaces.ipynb
@@ -213,7 +213,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The function `doc2bow()` simply counts the number of occurrences of each distinct word, converts the word to its integer word id and returns the result as a sparse vector. The sparse vector `[(0, 1), (1, 1)]` therefore reads: in the document *“Human computer interaction”*, the words computer (id 0) and human (id 1) appear once; the other ten dictionary words appear (implicitly) zero times."
+    "The function `doc2bow()` simply counts the number of occurrences of each distinct word, converts the word to its integer word id and returns the result as a sparse vector. The sparse vector `[(word_id, 1), (word_id, 1)]` therefore reads: in the document *“Human computer interaction”*, the words *\"computer\"* and *\"human\"*, each identified by the integer id that the built dictionary assigned to it, appear once; the other ten dictionary words appear (implicitly) zero times. Check their ids in the dictionary displayed in the previous cell to see that they match."
    ]
   },
   {
@@ -250,7 +250,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "By now it should be clear that the vector feature with `id=10 stands` for the question “How many times does the word graph appear in the document?” and that the answer is “zero” for the first six documents and “one” for the remaining three. As a matter of fact, we have arrived at exactly the same corpus of vectors as in the [Quick Example](https://radimrehurek.com/gensim/tutorial.html#first-example).\n",
+    "By now it should be clear that the vector feature with `id=10` stands for the question “How many times does the word graph appear in the document?” and that the answer is “zero” for the first six documents and “one” for the remaining three. As a matter of fact, we have arrived at exactly the same corpus of vectors as in the [Quick Example](https://radimrehurek.com/gensim/tutorial.html#first-example). If you're running this notebook on your own, the word ids may differ, but you should still be able to check the consistency between documents by comparing their vectors.\n",
     "\n",
     "## Corpus Streaming – One Document at a Time\n",
     "\n",
@@ -616,7 +616,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.5.1"
+   "version": "3.6.0"
   }
  },
  "nbformat": 4,
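
The id-dependent reading of the sparse vector described in the first changed cell can be illustrated with a minimal sketch. It is not gensim's implementation: `toy_doc2bow` and both `token2id` mappings are hypothetical stand-ins for `doc2bow()` and the built dictionary, assumed here only to show that the counts stay the same while the ids depend on the mapping.

```python
from collections import Counter


def toy_doc2bow(tokens, token2id):
    """Hypothetical stand-in for doc2bow(): count the tokens found in the
    mapping and return sorted (word_id, count) pairs; unknown tokens are
    silently skipped."""
    counts = Counter(token2id[t] for t in tokens if t in token2id)
    return sorted(counts.items())


tokens = "Human computer interaction".lower().split()

# Assumed id assignment matching the ids quoted in the removed text.
token2id = {"computer": 0, "human": 1, "graph": 10}
vec = toy_doc2bow(tokens, token2id)
print(vec)

# A different (equally hypothetical) assignment, as another run might produce.
other_token2id = {"computer": 4, "human": 7, "graph": 2}
other_vec = toy_doc2bow(tokens, other_token2id)
print(other_vec)
```

Both vectors record one occurrence each of "computer" and "human" and drop "interaction", which is in neither mapping; only the ids differ, which is why the revised text refers to `word_id` rather than fixed numbers.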