Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] sgd: Change deprecated n_iter to max_iter #2920

Merged
merged 6 commits into from
Feb 23, 2018
Merged

Conversation

smiks
Copy link
Contributor

@smiks smiks commented Feb 21, 2018

Issue

SGDClassificationLearner deprecated parameter (n_iter). Fixes #2917.

Description of changes

Deprecation fix: changed n_iter to max_iter (SGDClassifier)

Includes
  • Code changes
  • Tests
  • Documentation

@CLAassistant
Copy link

CLAassistant commented Feb 21, 2018

CLA assistant check
All committers have signed the CLA.

@@ -76,7 +76,7 @@ class Outputs(OWBaseLearner.Outputs):
learning_rate_index = Setting(0)
eta0 = Setting(.01)
power_t = Setting(.25)
n_iter = Setting(5)
max_iter = Setting(10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default in sklearn is 5. Is there a particular reason to have it at 10?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason. Sklearn will change it to 1000 from 0.21. I decided for 10 only to see if I changed everything correctly (testing widget in Orange).

@@ -76,7 +76,7 @@ class Outputs(OWBaseLearner.Outputs):
learning_rate_index = Setting(0)
eta0 = Setting(.01)
power_t = Setting(.25)
n_iter = Setting(5)
max_iter = Setting(10)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please implement a migrate_settings method. Without it, old workflows won't use the old n_iter value as the max_iter value while their semantics are essentially the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to do it.
Is this correct (added in class OWSGD)?

    @classmethod
    def migrate_settings(cls, settings_, version):
        if version < 2:
            settings_["max_iter"] = settings_.get("n_iter", 5)

Copy link
Collaborator

@pavlin-policar pavlin-policar Feb 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should do the trick. Perhaps delete the setting n_iter as well to clean up the old settings? You can do this in one line settings_["max_iter"] = settings_.pop("n_iter", 5).

@codecov-io
Copy link

codecov-io commented Feb 21, 2018

Codecov Report

Merging #2920 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master   #2920      +/-   ##
=========================================
+ Coverage    82.1%   82.1%   +<.01%     
=========================================
  Files         328     328              
  Lines       56262   56265       +3     
=========================================
+ Hits        46194   46198       +4     
+ Misses      10068   10067       -1

Copy link
Contributor

@lanzagar lanzagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. Looks good, just 2 things:

  1. SGDRegressionLearner should also be fixed in the same way (we have a common widget for both). It is in Orange/regression/linear.py
  2. We should probably add a control in the widget to set the tol parameter. It is not critical for this PR since the previous functionality is still the same. But if you don't add it now, please open an Issue that we should consider adding it before sklearn 0.21 (seems setting tol instead of max_iter will be the preferred way in the fututre).

(Also, the linting and change default back commits should be squashed into the first one at the end)

@lanzagar lanzagar changed the title Deprecation fix: changed n_iter to max_iter (SGDClassifier) [FIX] sgd: Change deprecated n_iter to max_iter Feb 22, 2018
@smiks
Copy link
Contributor Author

smiks commented Feb 22, 2018

I'll try to do both (tol parameter and fix for SGDRegressionLearner).
I only have to fix learner (widget is ok - in case I don't add tol parameter), right?

@lanzagar
Copy link
Contributor

Yes, the SGD widget calls the regression or classification learner depending on the type of data it gets.
It now already uses max_iter, so just the regression SGD class needs to be updated to match it.

… with sklearn 0.21 changes (tol and max_iter)
@lanzagar lanzagar self-assigned this Feb 23, 2018
@lanzagar
Copy link
Contributor

It was hard to fall back to the old functionality (of setting just the number of iterations) without being able to set tol to None - I guess that is why you also removed migrate settings? I added a checkbox for tol and brought back migration.
You can double check my commit to make sure everything is still OK, then I think this is ready for merging.

@smiks
Copy link
Contributor Author

smiks commented Feb 23, 2018

I'll check it. It is my first PR for Orange so I still have a lot to learn about Orange.

@ajdapretnar
Copy link
Contributor

We really appreciate it!

@smiks
Copy link
Contributor Author

smiks commented Feb 23, 2018

Looks ok. I have also tested it locally with different datasets (iris and my datasets). Tested different settings to see if both parameters work. Looks like it works, otherwise I wouldn't get this message Maximum number of iteration reached before convergence. Consider increasing max_iter to improve the fit.

@lanzagar lanzagar merged commit b7e748f into biolab:master Feb 23, 2018
@smiks
Copy link
Contributor Author

smiks commented Feb 23, 2018

Thanks everyone. I'm happy to have contributed to Orange.

@lanzagar lanzagar added this to the 3.11 milestone Feb 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SGDClassificationLearner deprecated parameter (n_iter)
6 participants