
Support OWL-QN method (L-BFGS with L1-regularization) #244

Merged · 19 commits from vbkaisetsu:owl-qn into argmin-rs:main on Aug 17, 2022

Conversation

@vbkaisetsu (Contributor)

Fix #243

This PR adds an L1-regularization option to LBFGS. When enabled, L1 regularization is performed via the OWL-QN method.
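For readers unfamiliar with OWL-QN: instead of the plain gradient of the smooth part f, it steers the search with a pseudo-gradient of f(x) + c·‖x‖₁ that picks a one-sided derivative at zero components. A minimal NumPy sketch of that rule (illustrative only; function and argument names are not from this PR):

```python
import numpy as np

def pseudo_gradient(x, grad_f, c):
    """Pseudo-gradient of f(x) + c * ||x||_1 as used by OWL-QN.

    Where x_i != 0 the L1 term is differentiable; at x_i == 0 the
    one-sided derivative that decreases the objective is chosen,
    or 0 if the subgradient interval [g_i - c, g_i + c] contains 0.
    """
    pg = np.empty_like(x)
    for i, (xi, gi) in enumerate(zip(x, grad_f)):
        if xi > 0:
            pg[i] = gi + c
        elif xi < 0:
            pg[i] = gi - c
        elif gi + c < 0:
            pg[i] = gi + c   # increasing x_i decreases the objective
        elif gi - c > 0:
            pg[i] = gi - c   # decreasing x_i decreases the objective
        else:
            pg[i] = 0.0      # 0 lies in the subgradient interval
    return pg
```

Note that a zero component with |∂f/∂x_i| ≤ c gets a zero pseudo-gradient and stays at zero, which is what produces sparse solutions.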

@stefan-k stefan-k self-requested a review August 12, 2022 13:05
@stefan-k (Member)

Thanks a lot for this PR! I will try to review it during the weekend (but unfortunately I can't guarantee it).

At first sight I was wondering if it would be better to have a dedicated OWL-QN solver instead of adding it to the existing L-BFGS. This would lead to code duplication but it may also be easier to understand and to maintain in the long run (and maybe it would be also better in terms of performance). But I'm not sure yet about what's the best approach. I'd be interested in your opinion on this.

@vbkaisetsu (Contributor, Author)

I also initially thought it would be better to define a separate structure.
However, on second thought, OWL-QN only adds a few processing steps to L-BFGS, and a separate structure would incur a high maintenance cost, since every change to L-BFGS would have to be mirrored in OWL-QN.
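For reference, one of those few extra steps is the orthant projection: after each line-search step, components that crossed into a different orthant than the one chosen at the start of the iteration are set back to zero. A hedged NumPy sketch (illustrative, not the argmin code; `xi` is assumed to encode the chosen orthant signs):

```python
import numpy as np

def project_to_orthant(param, xi):
    """Zero out components that left the orthant encoded by `xi`.

    `xi` typically holds sign(x_i) at the start of the line search
    (with a convention for zero components), so `param * xi < 0`
    flags a component that flipped into a different orthant.
    """
    return np.maximum(np.sign(param * xi), 0.0) * param
```

For example, `project_to_orthant(np.array([1.0, -2.0, 3.0]), np.array([1.0, 1.0, -1.0]))` zeroes the last two components, which disagree in sign with the chosen orthant.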

@stefan-k (Member) left a review

Thanks again for this valuable contribution! I did a quick initial review with mostly minor style-related comments. Could you also provide an example owl_qn.rs in the examples folder, please? This would help me to better understand what is going on and it would of course also be useful for users.

Review comments (now resolved) on:
- argmin/src/core/test_utils.rs
- argmin/src/solver/quasinewton/lbfgs.rs (4 comments)
@stefan-k (Member)

> However, on second thought, OWL-QN only adds a few processing steps to L-BFGS, and a separate structure would incur a high maintenance cost, since every change to L-BFGS would have to be mirrored in OWL-QN.

Yes, after having a more detailed look at the code I tend to agree. The only downside is the increased number of math-related trait bounds which add an additional implementation burden on people using their own types. But I think this is acceptable for now. We can reconsider this if it starts to become a problem.

@vbkaisetsu (Contributor, Author)

Thank you for your review; I fixed those issues.
I also fixed several bugs: the pseudo-gradient should be used in the line search as well, but the previous implementation used the ordinary gradient.
In addition, I added an L1-regularized version of the Rosenbrock example. Comparing its output with the following grid-search results, I believe the results are correct.

```python
import numpy as np

# Rosenbrock w/o regularization
best = (0.0, 0.0, float('inf'))
for x in np.arange(0, 2, 0.001):
    for y in np.arange(0, 2, 0.001):
        z = (1.0 - x)**2 + 100 * (y - x**2)**2
        if z < best[2]:
            best = (x, y, z)
print(best)
#=> (1.0, 1.0, 0.0)

# Rosenbrock w/ L1 regularization (coeff = 1.0)
best = (0.0, 0.0, float('inf'))
for x in np.arange(0, 2, 0.001):
    for y in np.arange(0, 2, 0.001):
        z = (1.0 - x)**2 + 100 * (y - x**2)**2 + np.abs(x) + np.abs(y)
        if z < best[2]:
            best = (x, y, z)
print(best)
#=> (0.249, 0.057, 0.8725020001)
```

TODO: Add the L1 term to the cost function of LineSearchProblem.

```rust
if let Some(xi) = self.xi.as_ref() {
    let zeros = param.zero_like();
    let param = P::max(&param.mul(xi).signum(), &zeros).mul(param);
    self.problem.cost(&param)
```
@vbkaisetsu (Contributor, Author)

@stefan-k This line should be as follows:

Suggested change:

```diff
-self.problem.cost(&param)
+self.problem.cost(&param) + self.l1_coeff.unwrap() * param.l1_norm()
```

Should I add the l1_norm() function to the ArgminNorm trait? Or should I create a new trait for the L1 norm?
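For what it's worth, the `l1_norm()` in question would just compute the sum of the absolute component values; in NumPy terms (illustrative sketch, not the trait implementation):

```python
import numpy as np

def l1_norm(x):
    # ||x||_1: sum of the absolute values of the components
    return np.abs(x).sum()
```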

@stefan-k (Member)

I suggest adding a dedicated trait ArgminL1Norm.
But then ArgminNorm should be renamed to ArgminL2Norm. If you want, you can also make that change, but it is absolutely fine if you don't, since it is quite a bit of work. I can also make that change after merging this :)

@stefan-k (Member) left a review

Sorry that you had to wait! I can't keep up with your impressive pace.
Looks really good, thanks a lot for all the effort, I really appreciate it!

There is only one minor thing. Unfortunately I can't place a comment at the line in question, so I'll describe it: could you please add a short note to the docs of the LBFGS struct, right above the "TODO", under the headline ## OWL-QN, explaining how L-BFGS can be turned into OWL-QN? This way users will more easily make the link between L1 regularization and OWL-QN.

After that I'd only ask you to squash your commits a bit and then I think this is ready to go! :)

@vbkaisetsu (Contributor, Author)

Thank you for your review. I updated the documentation.

@stefan-k (Member) left a review

Excellent! Thanks once again for all your work, I really appreciate this great addition to the library! :)

@stefan-k stefan-k merged commit ac0b42d into argmin-rs:main Aug 17, 2022
@vbkaisetsu vbkaisetsu deleted the owl-qn branch August 17, 2022 06:45