Support OWL-QN method (L-BFGS with L1-regularization) #244
Conversation
Thanks a lot for this PR! I will try to review it during the weekend (but unfortunately I can't guarantee it). At first sight I was wondering if it would be better to have a dedicated OWL-QN solver instead of adding it to the existing L-BFGS. This would lead to code duplication, but it may also be easier to understand and to maintain in the long run (and maybe it would also be better in terms of performance). But I'm not sure yet what the best approach is. I'd be interested in your opinion on this.
I also initially thought it would be better to define a separate structure.
Thanks again for this valuable contribution! I did a quick initial review with mostly minor style-related comments. Could you also provide an example `owl_qn.rs` in the `examples` folder, please? This would help me to better understand what is going on, and it would of course also be useful for users.
Yes, after having a more detailed look at the code I tend to agree. The only downside is the increased number of math-related trait bounds, which adds an additional implementation burden on people using their own types. But I think this is acceptable for now. We can reconsider this if it starts to become a problem.
Thank you for your review. I fixed them.

As a sanity check, here is a brute-force grid search locating the minimum of the Rosenbrock function with and without the L1 term:

```python
import numpy as np

# Rosenbrock w/o regularization
best = (0.0, 0.0, float('inf'))
for x in np.arange(0, 2, 0.001):
    for y in np.arange(0, 2, 0.001):
        z = (1.0 - x)**2 + 100 * (y - x**2)**2
        if z < best[2]:
            best = (x, y, z)
print(best)
#=> (1.0, 1.0, 0.0)

# Rosenbrock w/ L1 regularization (coeff = 1.0)
best = (0.0, 0.0, float('inf'))
for x in np.arange(0, 2, 0.001):
    for y in np.arange(0, 2, 0.001):
        z = (1.0 - x)**2 + 100 * (y - x**2)**2 + np.abs(x) + np.abs(y)
        if z < best[2]:
            best = (x, y, z)
print(best)
#=> (0.249, 0.057, 0.8725020001)
```

TODO: Add the L1 term to the cost function of `LineSearchProblem`.
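The same check can be sketched in Rust. This is an illustration only (the function names `rosenbrock` and `rosenbrock_l1` are ad hoc, not part of the PR); it evaluates the regularized and unregularized objective at the minimizers found by the grid search above:

```rust
// Rosenbrock function: minimized at (1, 1) with value 0.
fn rosenbrock(x: f64, y: f64) -> f64 {
    (1.0 - x).powi(2) + 100.0 * (y - x * x).powi(2)
}

// Rosenbrock plus an L1 penalty; the penalty pulls the minimizer
// toward the origin, as the grid search shows.
fn rosenbrock_l1(x: f64, y: f64, l1_coeff: f64) -> f64 {
    rosenbrock(x, y) + l1_coeff * (x.abs() + y.abs())
}
```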
```rust
if let Some(xi) = self.xi.as_ref() {
    let zeros = param.zero_like();
    let param = P::max(&param.mul(xi).signum(), &zeros).mul(param);
    self.problem.cost(&param)
```
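For readers unfamiliar with OWL-QN: the snippet above projects the trial point onto the orthant indicated by `xi` before evaluating the cost. A hedged sketch of that projection for plain `f64` slices (not argmin's actual generic code):

```rust
// Orthant projection: components of `param` whose sign disagrees with
// the corresponding component of `xi` are set to zero, mirroring
// max(signum(param * xi), 0) * param from the snippet above.
fn project_onto_orthant(param: &[f64], xi: &[f64]) -> Vec<f64> {
    param
        .iter()
        .zip(xi.iter())
        .map(|(&p, &x)| if (p * x).signum() > 0.0 { p } else { 0.0 })
        .collect()
}
```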
@stefan-k This line should be as follows:

```diff
- self.problem.cost(&param)
+ self.problem.cost(&param) + self.l1_coeff.unwrap() * param.l1_norm()
```
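In other words, the line-search cost should include the L1 penalty. A minimal sketch of this computation over plain slices (`l1_regularized_cost` is a hypothetical helper, not part of the PR):

```rust
// Adds the L1 penalty l1_coeff * ||param||_1 to an already-computed
// base cost, as the suggested change above does inside the line search.
fn l1_regularized_cost(base_cost: f64, param: &[f64], l1_coeff: f64) -> f64 {
    let l1_norm: f64 = param.iter().map(|p| p.abs()).sum();
    base_cost + l1_coeff * l1_norm
}
```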
Should I add the `l1_norm()` function to the `ArgminNorm` trait? Or should I create a new trait for the L1 norm?
I suggest adding a dedicated trait `ArgminL1Norm`. But then `ArgminNorm` should be renamed to `ArgminL2Norm`. If you want, you can also make that change, but it is absolutely fine if you don't want to do it, because it is quite some work. I can also make that change after merging this :)
Sorry that you had to wait! I can't keep up with your impressive pace.
Looks really good, thanks a lot for all the effort, I really appreciate it!
There is only one minor thing. Unfortunately I can't place a comment at the line that I'm talking about, so I'll have to describe it: could you please add a small text about how this can be turned into OWL-QN in the docs of the `LBFGS` struct, right above the "TODO", under the headline `## OWL-QN`? This way users will be able to more easily make the link between the L1 regularization and OWL-QN.
After that I'd only ask you to squash your commits a bit and then I think this is ready to go! :)
Thank you for your review. I updated the documentation.
Excellent! Thanks once again for all your work, I really appreciate this great addition to the library! :)
Fix #243

This PR adds an L1-regularization option to `LBFGS`. This option enables L1 regularization, performed via the OWL-QN method.