DOC add interval range for parameter of SGDRegressor #28373
Conversation
Generally looks good to me. Thanks for putting in the effort to fix this!
One comment about adding the ranges to all parameters. All other changes LGTM.
Could you document `epsilon` with:

Values must be in the range `[0.0, inf)`.
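For context, a minimal sketch (not part of this PR) of how such a docstring interval lines up with the constraint that scikit-learn's parameter validation declares. Note that `_param_validation` is a private module, so treat this as illustrative only:

```python
from numbers import Real

from sklearn.utils._param_validation import Interval

# `[0.0, inf)` in the docstring mirrors a declarative constraint like this
# one: closed on the left at 0.0, unbounded on the right.
epsilon_interval = Interval(Real, 0, None, closed="left")

print(epsilon_interval.is_satisfied_by(0.0))   # True: the boundary is included
print(epsilon_interval.is_satisfied_by(-0.1))  # False: below the interval
```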
Otherwise LGTM.
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Done in b1253fb. Thanks!
Since the commits addressing the last review, I have included two more commits. I committed them this way to make them easily reversible if you decide that's the way to go. Thank you for looking into this!
```diff
@@ -1125,7 +1125,7 @@ class SGDClassifier(BaseSGDClassifier):
     an int greater than 1, averaging will begin once the total number of
     samples seen reaches `average`. So ``average=10`` will begin
     averaging after seeing 10 samples.
-    Integer values must be in the range `[1, n_samples]`.
+    Integer values must be in the range `[0, n_samples]`.
```
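As a brief usage sketch of the two averaging modes this docstring describes (the semantics follow the scikit-learn docs; the estimators are only constructed, not fitted):

```python
from sklearn.linear_model import SGDClassifier

# average=True: averaged SGD, averaging starts from the first update.
clf_from_start = SGDClassifier(average=True)

# average=10: per the docstring above, averaging begins once 10 samples
# have been seen.
clf_after_ten = SGDClassifier(average=10)
```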
I agree the parameter constraint is `[0, n_samples]`, but I don't understand why. Is using `0` just a different way of saying `average=True`, as in "start right away"? If so, that is surprising, because in the context of `bool` I'd normally think of `0` as `False`. So my question is: should we update the constraint to match the docstring (the text also says "if set to a value greater than 1")?

Maybe someone with more knowledge can comment.
Looking at the surrounding code, I would correct the parameter validation here, because we have a lot of code with checks like `if self.average > 0`.
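For what it's worth, `bool` is a subclass of `int` in Python (`True == 1`, `False == 0`), which is why a guard like `if self.average > 0` treats `0` and `False` identically; a plain-Python sketch:

```python
# bool subclasses int, so comparisons coerce True -> 1 and False -> 0.
for value in (True, False, 0, 1, 10):
    print(f"average={value!r}: averaging enabled -> {value > 0}")

# average=True:  averaging enabled -> True   (True == 1)
# average=False: averaging enabled -> False
# average=0:     averaging enabled -> False  (behaves like False here)
```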
@betatim @glemaitre Would you like me to revert the commit that updates the docstrings related to the `average` parameter (it is isolated in d181bda) and open an issue regarding the `average` constraints and usage?
Indeed this would be easier to handle in a separate PR.
@betatim @glemaitre I opened an issue for this here. Feel free to edit it as you see fit.
As far as I can tell, the new docstrings you added are consistent with what the parameter-checking system uses. One comment about the added scope: in general it is easier not to "add on" to an existing PR. In this case it is kind of related, but it also broadens the scope of the PR (to more estimators). Getting a PR, the author, and the reviewers all to converge so that the merge button can be clicked turns out to be quite hard, so adding "divergence" by adding on to a PR tends to prevent convergence :D Not a big deal here, especially as this is your first time contributing; more something to consider for the future.
Thank you very much! I'll be mindful of that in the future.
LGTM. Let's tackle the `average` issue in another PR.
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Reference Issues/PRs
I couldn't find a related issue.
What does this implement/fix? Explain your changes.
The screenshot below is straight from `SGDRegressor`'s documentation. The sentence "Also used to compute the learning rate when set to `learning_rate` is set to 'optimal'", which also appears in `SGDClassifier`'s documentation, suffers from multiple issues. This pull request fixes all of these.
In addition to the above, interval constraints for multiple parameters were added to the regressor's documentation. A couple of opportunistic improvements were also made (in 8262eba).
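As a quick way to check a documented range against the actual validator (a sketch; the exact exception type and message depend on the scikit-learn version):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.RandomState(0)
X, y = rng.rand(20, 3), rng.rand(20)

# epsilon is documented as `[0.0, inf)`; a negative value should be
# rejected by the parameter validation at fit time.
try:
    SGDRegressor(epsilon=-1.0).fit(X, y)
except Exception as exc:  # InvalidParameterError in recent versions
    print(f"{type(exc).__name__}: {exc}")
```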
Any other comments?