Support OWL-QN method (L-BFGS with L1-regularization) #244
Conversation
Thanks a lot for this PR! I will try to review it during the weekend (but unfortunately I can't guarantee it). At first sight I was wondering if it would be better to have a dedicated OWL-QN solver instead of adding it to the existing L-BFGS. This would lead to code duplication, but it may also be easier to understand and to maintain in the long run (and maybe it would also be better in terms of performance). But I'm not sure yet what the best approach is. I'd be interested in your opinion on this.
I also initially thought it would be better to define a separate structure.
Thanks again for this valuable contribution! I did a quick initial review with mostly minor style-related comments. Could you also provide an example `owl_qn.rs` in the `examples` folder, please? This would help me to better understand what is going on, and it would of course also be useful for users.
Yes, after having a more detailed look at the code I tend to agree. The only downside is the increased number of math-related trait bounds, which adds an additional implementation burden on people using their own types. But I think this is acceptable for now. We can reconsider this if it starts to become a problem.
Thank you for your review. I fixed them.

As a sanity check, here is a brute-force grid search locating the minimum of the Rosenbrock function with and without the L1 term:

```python
import numpy as np

# Rosenbrock w/o regularization
best = (0.0, 0.0, float('inf'))
for x in np.arange(0, 2, 0.001):
    for y in np.arange(0, 2, 0.001):
        z = (1.0 - x)**2 + 100 * (y - x**2)**2
        if z < best[2]:
            best = (x, y, z)
print(best)
#=> (1.0, 1.0, 0.0)

# Rosenbrock w/ L1 regularization (coeff = 1.0)
best = (0.0, 0.0, float('inf'))
for x in np.arange(0, 2, 0.001):
    for y in np.arange(0, 2, 0.001):
        z = (1.0 - x)**2 + 100 * (y - x**2)**2 + np.abs(x) + np.abs(y)
        if z < best[2]:
            best = (x, y, z)
print(best)
#=> (0.249, 0.057, 0.8725020001)
```

TODO: Add the L1 term to the cost function of `LineSearchProblem`.
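The same check can be sketched in Rust. This is an illustration only (the function names `rosenbrock` and `rosenbrock_l1` are ad hoc, not part of the PR); it evaluates the regularized and unregularized objective at the minimizers found by the grid search above:

```rust
// Rosenbrock function: minimized at (1, 1) with value 0.
fn rosenbrock(x: f64, y: f64) -> f64 {
    (1.0 - x).powi(2) + 100.0 * (y - x * x).powi(2)
}

// Rosenbrock plus an L1 penalty; the penalty pulls the minimizer
// toward the origin, as the grid search shows.
fn rosenbrock_l1(x: f64, y: f64, l1_coeff: f64) -> f64 {
    rosenbrock(x, y) + l1_coeff * (x.abs() + y.abs())
}
```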
```rust
if let Some(xi) = self.xi.as_ref() {
    let zeros = param.zero_like();
    let param = P::max(&param.mul(xi).signum(), &zeros).mul(param);
    self.problem.cost(&param)
```
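For readers unfamiliar with OWL-QN: the snippet above projects the trial point onto the orthant indicated by `xi` before evaluating the cost. A hedged sketch of that projection for plain `f64` slices (not argmin's actual generic code):

```rust
// Orthant projection: components of `param` whose sign disagrees with
// the corresponding component of `xi` are set to zero, mirroring
// max(signum(param * xi), 0) * param from the snippet above.
fn project_onto_orthant(param: &[f64], xi: &[f64]) -> Vec<f64> {
    param
        .iter()
        .zip(xi.iter())
        .map(|(&p, &x)| if (p * x).signum() > 0.0 { p } else { 0.0 })
        .collect()
}
```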
@stefan-k This line should be as follows:

```diff
- self.problem.cost(&param)
+ self.problem.cost(&param) + self.l1_coeff.unwrap() * param.l1_norm()
```
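In other words, the line-search cost should include the L1 penalty. A minimal sketch of this computation over plain slices (`l1_regularized_cost` is a hypothetical helper, not part of the PR):

```rust
// Adds the L1 penalty l1_coeff * ||param||_1 to an already-computed
// base cost, as the suggested change above does inside the line search.
fn l1_regularized_cost(base_cost: f64, param: &[f64], l1_coeff: f64) -> f64 {
    let l1_norm: f64 = param.iter().map(|p| p.abs()).sum();
    base_cost + l1_coeff * l1_norm
}
```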
Should I add the `l1_norm()` function to the `ArgminNorm` trait? Or should I create a new trait for the L1 norm?
I suggest adding a dedicated trait `ArgminL1Norm`. But then `ArgminNorm` should be renamed to `ArgminL2Norm`. If you want, you can also make that change, but it is absolutely fine if you don't want to do it, because it is quite some work. I can also make that change after merging this :)
Sorry that you had to wait! I can't keep up with your impressive pace.
Looks really good, thanks a lot for all the effort, I really appreciate it!
There is only one minor thing. Unfortunately I can't place a comment at the line that I'm talking about, so I'll have to describe it: could you please add a small text about how this can be turned into OWL-QN in the docs of the `LBFGS` struct, right above the "TODO", under the headline `## OWL-QN`? This way users will be able to more easily make the link between the L1 regularization and OWL-QN.
After that I'd only ask you to squash your commits a bit and then I think this is ready to go! :)
Thank you for your review. I updated the documentation.
Excellent! Thanks once again for all your work, I really appreciate this great addition to the library! :)
Fix #243

This PR adds an L1-regularization option to `LBFGS`. This option enables L1 regularization, performed via the OWL-QN method.