Add fast computation of functional_variance for DiagLLLaplace and KronLLLaplace #145

Merged: 10 commits into main on Jun 30, 2024

Conversation

@wiseodd (Collaborator) commented on Feb 24, 2024

PR for #138. This is very useful for LLMs, diffusion models, or any model with many outputs.

TODO: implementation for KronLLLaplace. I'd like input from @aleximmer, who is the author of matrix.py.
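
To illustrate the idea for DiagLLLaplace, here is a minimal sketch (hypothetical helper names, not the PR's actual code), assuming a last-layer weight W of shape (C, D) with a diagonal posterior variance of the same shape:

```python
import torch

# Hypothetical sketch, not the PR's implementation. With f = W @ phi and
# an independent (diagonal) posterior over W, the predictive variance per
# class is Var(f_c) = sum_d phi_d^2 * var_W[c, d] -- no Jacobian of shape
# (batch, num_classes, num_params) is ever materialized.
def functional_variance_fast_diag(phi: torch.Tensor, var_W: torch.Tensor) -> torch.Tensor:
    # phi: (batch, D) last-layer features; var_W: (C, D) posterior variances
    return phi.pow(2) @ var_W.T  # (batch, C)
```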

@wiseodd added the enhancement (New feature or request) label on Feb 24, 2024
Review thread on laplace/lllaplace.py (outdated, resolved):
@@ -201,6 +208,40 @@ def __init__(self, model, likelihood, sigma_noise=1., prior_precision=1.,
def _init_H(self):
self.H = Kron.init_from_model(self.model.last_layer, self._device)

def _functional_variance_fast(self, X):
@wiseodd (Collaborator, Author) commented:
@aleximmer here's the initial implementation for KronLLLaplace. However, the test comparing this f_var against f_var = la.posterior_precision.inv_square_form(Js) fails...

Could you please check this? Feel free to propose a more elegant solution, since you know the implementation of Kron better.

@aleximmer (Owner) commented:

What's the advantage you're trying to achieve with this? Is it faster because you use the damping formulation of the posterior update (eigenvalues + sqrt(delta))? I suspect that approximation is what makes the test fail.

@wiseodd (Collaborator, Author) commented:

The idea is to use this identity of the matrix-normal distribution (see https://arxiv.org/pdf/2002.10118.pdf, Appendix B.1):

[Screenshot: the matrix-normal identity from Appendix B.1]
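
A hedged reconstruction of that identity (the factor ordering depends on the vectorization convention, so double-check against the paper): if the last-layer weight $W \in \mathbb{R}^{C \times D}$ has a matrix-normal posterior with $\mathrm{vec}(W) \sim \mathcal{N}(\mathrm{vec}(M), V \otimes U)$, then for $f = W \phi(x)$,

$$\mathrm{Cov}[f] = \left( \phi(x)^\top V \, \phi(x) \right) U.$$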

This is much faster than the naive functional_variance, since we don't need to compute the Jacobian, which has shape (batch_size, num_classes, num_params). We only need to multiply the inverse Kronecker factors with the last-layer features $\phi(x)$.

Let me know your thoughts on the best way to achieve this.

@wiseodd (Collaborator, Author) commented:

PS: the rationale for the sqrt damping: I follow KFAC-Laplace (https://openreview.net/pdf?id=Skdvd2xAZ). Let me know what you think.
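
For readers following along, a rough sketch of what that sqrt damping looks like (illustrative names, not this PR's code); A and G denote the Kronecker curvature factors and delta the prior precision:

```python
import torch

# Illustrative sketch of KFAC-Laplace-style damping. The posterior
# precision A ⊗ G + delta * I is approximated by the Kronecker product
# (A + sqrt(delta) I) ⊗ (G + sqrt(delta) I), which stays Kronecker-
# factored and can therefore be inverted factor by factor.
def damped_inverse_factors(A, G, delta):
    sqrt_delta = delta ** 0.5
    A_inv = torch.inverse(A + sqrt_delta * torch.eye(A.shape[0], device=A.device))
    G_inv = torch.inverse(G + sqrt_delta * torch.eye(G.shape[0], device=G.device))
    return A_inv, G_inv
```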

@aleximmer (Owner) commented:

I looked into this a bit more now. It's a very important change to have, since it's significantly faster. It should probably be implemented via the damping=True/False flag. The exact inversion with a prior has to be done with an eigendecomposition, which is what we do right now; this can be avoided in the fast predictive by using the damping formulation instead. That would also avoid recomputing U and V from the eigendecomposition, and we could add the method to matrix.py. I could look into how best to do this. One thing I'm wondering: do you know if this is also possible for the joint posterior predictive?
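
Combining the matrix-normal identity with the damping formulation, the fast predictive could look roughly like this (a sketch assuming Cov(vec(W)) ≈ A_inv ⊗ G_inv, with A_inv over the D features and G_inv over the C outputs; the actual matrix.py integration may order the factors differently):

```python
import torch

# Hedged sketch of the fast functional variance under a Kronecker-
# factored last-layer posterior: Cov[f(x)] = (phi^T A_inv phi) * G_inv.
def functional_variance_fast_kron(phi, A_inv, G_inv):
    # phi: (batch, D) last-layer features
    s = torch.einsum('bi,ij,bj->b', phi, A_inv, phi)  # (batch,) scalars
    return s[:, None, None] * G_inv                   # (batch, C, C)
```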

@wiseodd (Collaborator, Author) commented:

OK, great. I'll leave it to you then to do this implicitly via damping. The corresponding test case is tests/test_lllaplace.py -k "test_functional_variance_fast[KronLLLaplace]".

@wiseodd (Collaborator, Author) commented:

I think the joint predictive can also benefit from this (see the code for the naive version). However, this needs more thought, so let's do it in a separate PR.
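
For reference, the identity appears to extend to the joint case as well; a hypothetical sketch under the same assumptions as above (my extrapolation, not code from this PR):

```python
import torch

# Hypothetical: joint covariance between two batches of inputs under the
# same Kronecker posterior; block (m, n) couples f(x_m) and f(x_n).
def joint_functional_covariance(phi1, phi2, A_inv, G_inv):
    S = phi1 @ A_inv @ phi2.T            # (M, N) feature-space similarities
    return S[:, :, None, None] * G_inv   # (M, N, C, C) covariance blocks
```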

@runame changed the base branch from main to mc-subset2 on March 1, 2024
@runame changed the base branch from mc-subset2 to main on March 1, 2024
@wiseodd marked this pull request as ready for review on March 3, 2024
@wiseodd merged commit e3ca2c6 into main on June 30, 2024
3 checks passed
@wiseodd deleted the speedup_llla branch on June 30, 2024
Labels: enhancement (New feature or request)
Projects: none yet
Participants: 3