[WIP] Sklearn wrapper for RandomProjections Model #1395

chinmayapancholi13 · 2017-06-06T10:50:56Z

This PR adds a scikit-learn wrapper for the Random Projections (RP) model.

menshikh-iv · 2017-06-08T06:41:40Z

gensim/sklearn_integration/sklearn_wrapper_gensim_rpmodel.py

+    Base RP module
+    """
+
+    def __init__(self, corpus, id2word=None, num_topics=300):


Please remove the corpus argument; the corpus should be passed only to the fit method.
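To illustrate the convention behind this request, here is a minimal sketch of an estimator whose constructor takes only hyperparameters and whose fit method receives the data. It is a dependency-free stand-in, not the wrapper from this PR: the class name RpTransformerSketch, the seed parameter, the projection_ attribute, and the simplified +1/-1 projection matrix are all assumptions made for the example and do not reflect gensim's RpModel internals.

```python
import random


class RpTransformerSketch(object):
    """Hypothetical stand-in for the wrapper under review (not gensim code)."""

    def __init__(self, id2word=None, num_topics=300, seed=None):
        # Only hyperparameters are stored here; no corpus is accepted.
        self.id2word = id2word
        self.num_topics = num_topics
        self.seed = seed  # assumption: not part of the reviewed signature
        self.projection_ = None  # set by fit()

    def get_params(self, deep=True):
        return {"id2word": self.id2word, "num_topics": self.num_topics, "seed": self.seed}

    def set_params(self, **parameters):
        for name, value in parameters.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y=None):
        """Fit the model according to the given training data."""
        # X is a list of documents, each a list of (term_id, weight) pairs.
        num_terms = 1 + max(term_id for doc in X for term_id, _ in doc)
        rng = random.Random(self.seed)
        # Simplified random matrix of +1/-1 entries, one row per output dimension.
        self.projection_ = [
            [rng.choice((-1.0, 1.0)) for _ in range(num_terms)]
            for _ in range(self.num_topics)
        ]
        return self

    def transform(self, docs):
        if self.projection_ is None:
            raise RuntimeError("This model has not been fitted yet; call fit() first.")
        # Terms unseen during fit are ignored.
        return [
            [sum(row[t] * w for t, w in doc if t < len(row)) for row in self.projection_]
            for doc in docs
        ]


# The corpus only appears at fit/transform time, never in the constructor.
corpus = [[(0, 1.0), (2, 2.0)], [(1, 1.0)]]
model = RpTransformerSketch(num_topics=2, seed=0).fit(corpus)
vectors = model.transform(corpus)
```

Keeping data out of the constructor is what lets utilities that clone an estimator from its get_params output, as scikit-learn pipelines and grid searches do, work without carrying the training corpus along.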

menshikh-iv · 2017-06-08T06:44:32Z

gensim/sklearn_integration/sklearn_wrapper_gensim_rpmodel.py

+    def fit(self, X, y=None):
+        """
+        For fitting corpus into class object.
+        Calls gensim.models.RpModel


Replace the docstring with "Fit the model according to the given training data."

piskvorky

Minor code style nitpicks :)

piskvorky · 2017-06-13T09:19:49Z

gensim/sklearn_integration/sklearn_wrapper_gensim_lsimodel.py

@@ -58,7 +58,7 @@ def set_params(self, **parameters):
    def fit(self, X, y=None):
        """
        For fitting corpus into the class object.
-        Calls gensim.model.LsiModel:
+        Calls gensim.models.LsiModel:
        >>>gensim.models.LsiModel(corpus=corpus, num_topics=num_topics, id2word=id2word, chunksize=chunksize, decay=decay, onepass=onepass, power_iters=power_iters, extra_samples=extra_samples)


This documentation line doesn't seem to help -- what are these undefined variables like id2word, chunksize etc?

These parameters (id2word, chunksize, etc.) belong to the LSI model; this change is in the file sklearn_wrapper_gensim_lsimodel.py. Because the change was so small (literally one word in a docstring), I included it in this PR, which otherwise concerns the RP model wrapper.
There is also a similar change for the LDA model here. Should I remove these changes from this PR?

piskvorky · 2017-06-13T09:20:18Z

gensim/sklearn_integration/sklearn_wrapper_gensim_rpmodel.py

+#
+# Copyright (C) 2011 Radim Rehurek <radimrehurek@seznam.cz>
+# Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html
+#


Code style: remove #, insert blank line.

Thanks! Updated now.

piskvorky · 2017-06-13T09:21:37Z

gensim/sklearn_integration/sklearn_wrapper_gensim_rpmodel.py

+Scikit learn interface for gensim for easy use of gensim with scikit-learn
+Follows scikit-learn API conventions
+"""
+from gensim import models


Blank line before imports.

Also, block the imports: built-in first, 3rd party second, local package imports last.

Thanks! Updated now.

menshikh-iv · 2017-06-14T07:25:24Z

Same comments as in #1405

menshikh-iv · 2017-06-19T08:49:29Z

gensim/test/test_sklearn_integration.py

+        score = text_rp.score(corpus, data.target)
+        self.assertGreater(score, 0.40)
+
+    def testPersistence(self):


Same as LdaSeq

Thanks. Done.

menshikh-iv · 2017-06-19T08:49:51Z

gensim/test/test_sklearn_integration.py

+        text_rp = Pipeline((('features', model,), ('classifier', clf)))
+        text_rp.fit(corpus, data.target)
+        score = text_rp.score(corpus, data.target)
+        self.assertGreater(score, 0.40)


same as LdaSeq

Thanks. Done.
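The review comment itself does not spell out the requested change, but the follow-up commits mention setting a fixed seed in 'testPipeline'. As a hedged illustration of why that matters for a test with a score threshold, the sketch below shows that a randomly initialised projection is only reproducible when the seed is fixed. The class SeededProjection and its parameters are hypothetical and stand in for the real model.

```python
import random


class SeededProjection(object):
    """Hypothetical stand-in for a randomly initialised model (not gensim code)."""

    def __init__(self, num_topics, num_terms, seed):
        rng = random.Random(seed)
        # Random matrix of +1/-1 entries drawn from the seeded generator.
        self.rows = [
            [rng.choice((-1.0, 1.0)) for _ in range(num_terms)]
            for _ in range(num_topics)
        ]

    def transform(self, doc):
        # doc is a list of (term_id, weight) pairs.
        return [sum(row[t] * w for t, w in doc) for row in self.rows]


doc = [(0, 1.0), (2, 2.0)]
first = SeededProjection(num_topics=3, num_terms=4, seed=42)
second = SeededProjection(num_topics=3, num_terms=4, seed=42)

# Two models built with the same seed produce identical features,
# so any score computed from them is stable across test runs.
same_seed_matches = first.transform(doc) == second.transform(doc)
```

Without a fixed seed, the features fed to the classifier change from run to run, and an assertion such as a minimum score can fail intermittently.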

chinmayapancholi13 added 2 commits June 6, 2017 03:41

created new file for rpmodel_sklearn_wrapper

0c5bcb0

updated get_params, set_params functions

0810428

chinmayapancholi13 changed the title ~~Sklearn wrapper for RandomProjections Model~~ [WIP] Sklearn wrapper for RandomProjections Model Jun 6, 2017

chinmayapancholi13 added 5 commits June 6, 2017 19:25

correction in calling init function

d67f047

added fit, transform, partial_fit function

a9ce401

added tests for Rp model's sklearn wrapper

05ad743

minor correction in docstring in LDA and LSI models

f1b9c4a

added newline before class definition (PEP8)

8696e54

menshikh-iv reviewed Jun 8, 2017


chinmayapancholi13 added 3 commits June 8, 2017 06:08

removed 'corpus' from 'init' and set 'corpus' in 'fit'

fe2f947

updated docstring for 'fit' function

7317173

refactored code to use 'self.model'

692be88

piskvorky reviewed Jun 13, 2017


code style changes

a2ec746

chinmayapancholi13 added 13 commits June 14, 2017 00:38

refactored wrapper and tests

954715e

removed 'self.corpus' attribute and refactored slightly

6c3b819

updated 'self.__model' to 'self.gensim_model'

aee04ff

updated test data

a73dacc

updated 'fit' and 'transform' methods

da602d9

updated 'testTransform' test

c1087ac

PEP8 change

00f5336

updated 'testTransform' test

376959d

added 'NotFittedError' in 'transform' function

9c888d6

added 'testPersistence' and 'testModelNotFitted' tests

373c36c

added input 'docs' description in 'transform' function

f3c3601

added 'testPipeline' test

ab90b68

replaced 'text_lda' variable with 'text_rp'

928c7f2

menshikh-iv suggested changes Jun 19, 2017


chinmayapancholi13 and others added 3 commits June 19, 2017 03:18

updated 'testPersistence' test

cf13c9a

set fixed seed in 'testPipeline' test

cde12f2

Merge branch 'develop' into rp_wrapper_scikitlearn

26cd2df

menshikh-iv merged commit 2ea1af0 into piskvorky:develop Jun 20, 2017
