Merge pull request #1379 from jstol/develop

Fix doc2vec __init__ documentation
piskvorky · Jun 4, 2017 · 214fff7
2 parents 7e74d15 + c2636c0
Showing 1 changed file with 4 additions and 2 deletions.
diff --git a/gensim/models/doc2vec.py b/gensim/models/doc2vec.py
@@ -565,7 +565,7 @@ def __init__(self, documents=None, dm_mean=None,
         `window` is the maximum distance between the predicted word and context words used for prediction
         within a document.
 
-        `alpha` is the initial learning rate (will linearly drop to zero as training progresses).
+        `alpha` is the initial learning rate (will linearly drop to `min_alpha` as training progresses).
 
         `seed` = for the random number generator.
         Note that for a fully deterministically-reproducible run, you must also limit the model to
@@ -587,10 +587,12 @@ def __init__(self, documents=None, dm_mean=None,
         `iter` = number of iterations (epochs) over the corpus. The default inherited from Word2Vec is 5,
         but values of 10 or 20 are common in published 'Paragraph Vector' experiments.
 
-        `hs` = if 1 (default), hierarchical sampling will be used for model training (else set to 0).
+        `hs` = if 1, hierarchical softmax will be used for model training.
+        If set to 0 (default), and `negative` is non-zero, negative sampling will be used.
 
         `negative` = if > 0, negative sampling will be used, the int for negative
         specifies how many "noise words" should be drawn (usually between 5-20).
+        Default is 5. If set to 0, no negative sampling is used.
 
         `dm_mean` = if 0 (default), use the sum of the context word vectors. If 1, use the mean.
         Only applies when dm is used in non-concatenative mode.
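
Note on the `alpha` wording fixed above: the corrected docstring says the learning rate drops linearly to `min_alpha`, not to zero. A minimal sketch of that schedule follows. It is an illustration only, not gensim's actual training code: the helper name `linear_alpha` and the `progress` argument are hypothetical, and the default values 0.025 and 0.0001 are assumed here rather than taken from this diff.

```python
def linear_alpha(progress, alpha=0.025, min_alpha=0.0001):
    """Return the learning rate after a fraction `progress` of training.

    Hypothetical helper: `progress` is 0.0 at the start of training and
    1.0 at the end. The default rates are assumptions for illustration.
    """
    # Clamp progress so the rate never leaves the [min_alpha, alpha] range.
    progress = min(max(progress, 0.0), 1.0)
    # Straight-line interpolation from alpha down to min_alpha.
    return alpha - (alpha - min_alpha) * progress


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"progress={p:.2f}  alpha={linear_alpha(p):.6f}")
```

Under these assumptions the rate ends at `min_alpha` rather than at zero, which is the distinction the docstring change makes.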