Compute pointwise log-likelihood for each observation #1300

DavAug · 2021-02-21T12:36:26Z

ArviZ provides a simple API to compute the LOO or WAIC for performance assessment of models, see https://arviz-devs.github.io/arviz/api/generated/arviz.waic.html.

What this requires, however, is the pointwise log-likelihood of each parameter sample in a chain, for each observation. So for N observations, M iterations and K chains, we would need to store N x M x K log-pdf values.
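To illustrate why the pointwise values are needed, here is a minimal sketch of a WAIC calculation from an array of shape (K, M, N), following the standard definition (log pointwise predictive density minus the posterior variance of the log-likelihood). The function name `waic_from_pointwise` is hypothetical, and ArviZ may differ in details such as scaling or the variance estimator.

```python
import numpy as np


def waic_from_pointwise(log_lik):
    """
    Returns (elpd_waic, p_waic) on the log scale.

    log_lik: array of shape (K, M, N) = (chains, iterations, observations).
    """
    log_lik = np.asarray(log_lik, dtype=float)
    n_obs = log_lik.shape[-1]

    # Pool chains and iterations: shape (K * M, N)
    flat = log_lik.reshape(-1, n_obs)

    # Log pointwise predictive density, via a numerically stable log-mean-exp
    max_ll = flat.max(axis=0)
    lppd_i = max_ll + np.log(np.mean(np.exp(flat - max_ll), axis=0))

    # Effective number of parameters: sample variance of the log-likelihood
    p_waic_i = flat.var(axis=0, ddof=1)

    return float(np.sum(lppd_i - p_waic_i)), float(np.sum(p_waic_i))


# Degenerate example: every sample gives the same log-likelihood of -1
example = np.full((2, 5, 3), -1.0)
elpd_waic, p_waic = waic_from_pointwise(example)
```

Neither term can be recovered from the summed log-likelihood alone, which is why the per-observation values have to be kept or recomputed.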

The computationally most efficient way to generate the pointwise log-likelihoods would probably be to store them while running the chain, before summing them up across observations. That would require some changes to our pints.LogPDF, pints.LogPosterior and pints.MCMCSampler / pints.MCMCController though.

Alternatively, we could consider implementing a routine that takes the LogPDF of a problem and the chains, and then computes the log-pdfs for the observations again. This would still require us to implement an additional method on the LogPDFs that returns the pointwise log-pdfs.
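A rough sketch of this second, post-hoc option is given below. It does not use any PINTS classes: the toy model, the Gaussian noise assumption, and the names `simulate`, `pointwise_log_likelihood` and `recompute_pointwise` are all hypothetical, and the chains are assumed to have shape (chains, iterations, parameters).

```python
import numpy as np


def simulate(parameters, times):
    # Hypothetical toy model: a straight line through the origin
    return parameters[0] * times


def pointwise_log_likelihood(parameters, times, values):
    # parameters = (slope, sigma); independent Gaussian noise is assumed
    sigma = parameters[1]
    residuals = values - simulate(parameters, times)
    return -0.5 * np.log(2 * np.pi * sigma**2) - residuals**2 / (2 * sigma**2)


def recompute_pointwise(chains, times, values):
    # chains: assumed shape (K, M, n_parameters); returns shape (K, M, N)
    chains = np.asarray(chains, dtype=float)
    n_chains, n_iter, _ = chains.shape
    out = np.empty((n_chains, n_iter, len(times)))
    for k in range(n_chains):
        for m in range(n_iter):
            # One extra model evaluation per stored sample
            out[k, m] = pointwise_log_likelihood(chains[k, m], times, values)
    return out


times = np.array([0.0, 1.0, 2.0])
values = 2.0 * times
chains = np.array([[[2.0, 1.0], [2.0, 1.0]]])  # 1 chain, 2 iterations
pointwise = recompute_pointwise(chains, times, values)
```

The cost of this approach is one additional model evaluation per stored sample, which matters when the forward model is expensive.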

MichaelClerx · 2021-02-23T14:35:17Z

Discussed in meeting today:

Pointwise log-likelihood = log likelihood of every point in a ProblemLogLikelihood, before summing
Logical entry to add this in would be somewhere in ProblemLogLikelihood ?

ben18785 · 2021-02-24T17:28:07Z

I've actually realised that Stan doesn't save a point-wise log-likelihood as it runs. Instead, it computes it afterwards using each posterior sample. I think, however, that we should probably try to improve on this since our models are generally more expensive to run.

Rebecca-Rumney · 2021-03-05T13:54:09Z

I've been looking into how to do this and this is my idea:
It requires changing the __call__ function of each ProblemLogLikelihood and adding 2 new functions so that:

def __call__(self, x):
    pointwise = self.create_pointwise_loglikelihoods(x)
    self._last_pointwise_loglikelihoods = pointwise
    return np.sum(pointwise)

def create_pointwise_loglikelihoods(self, parameters):
    """
    Returns a matrix of size nt x no containing the log-likelihood of each
    observation at each time point, for the given parameters.
    """
    # To be implemented by each ProblemLogLikelihood
    raise NotImplementedError

def get_last_pointwise_loglikelihoods(self):
    return self._last_pointwise_loglikelihoods

This keeps the changes to existing code small. If you want the pointwise log-likelihoods while using the ask-and-tell interface, you can call get_last_pointwise_loglikelihoods at each step without repeating the calculation. I believe this will also work when a LogPosterior or similar is used for the telling. You can also do it the Stan way if you need to, using create_pointwise_loglikelihoods.
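As a minimal sketch of how this could look from the user's side, the example below uses a stand-in class rather than a real ProblemLogLikelihood, with a made-up linear model and Gaussian noise. The loop only mimics the shape of an ask-and-tell loop; no PINTS sampler is involved.

```python
import numpy as np


class ToyGaussianLogLikelihood:
    """Stand-in for a ProblemLogLikelihood; not the real PINTS class."""

    def __init__(self, times, values, sigma):
        self._times = np.asarray(times, dtype=float)
        self._values = np.asarray(values, dtype=float)
        self._sigma = float(sigma)
        self._last_pointwise_loglikelihoods = None

    def create_pointwise_loglikelihoods(self, parameters):
        # Hypothetical model: values = parameters[0] * times + Gaussian noise
        residuals = self._values - parameters[0] * self._times
        return (-0.5 * np.log(2 * np.pi * self._sigma**2)
                - residuals**2 / (2 * self._sigma**2))

    def __call__(self, x):
        pointwise = self.create_pointwise_loglikelihoods(x)
        self._last_pointwise_loglikelihoods = pointwise
        return np.sum(pointwise)

    def get_last_pointwise_loglikelihoods(self):
        return self._last_pointwise_loglikelihoods


log_likelihood = ToyGaussianLogLikelihood([0, 1, 2], [0, 2, 4], sigma=1.0)
rng = np.random.default_rng(1)

stored = []
for _ in range(5):
    x = [rng.normal(2.0, 0.1)]   # stand-in for the sampler's "ask" step
    score = log_likelihood(x)    # would be passed to the "tell" step
    # Copy the cached values; the user decides where to keep them
    stored.append(log_likelihood.get_last_pointwise_loglikelihoods().copy())
stored = np.array(stored)        # shape (iterations, N)
```

One thing to watch with this design: the cache holds the values for the most recently evaluated point, which for samplers that reject proposals is not necessarily the sample that ends up in the chain.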

DavAug · 2021-03-05T17:40:50Z

I think this looks really good and fits very nicely into the pints interface @Rebecca-Rumney !

A little bit unrelated to the API, I am wondering whether it is actually a good idea to store the pointwise log-pdfs always, as for large autocorrelations we may want to throw out a majority of the samples and the memory requirements can be quite large for larger datasets (so we might not actually save the energy needed for the computation as we need it for storage). So it's probably good to be able to switch storing of the pointwise log-pdfs off if we want. But I guess that will be a switch in the MCMCController?
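For a sense of scale, here is a back-of-the-envelope estimate of the storage needed for the full set of pointwise log-pdfs, with and without thinning. The helper name `pointwise_storage_bytes` and the example sizes are made up for illustration, and 64-bit floats are assumed.

```python
import numpy as np


def pointwise_storage_bytes(n_obs, n_iter, n_chains, thinning=1,
                            dtype=np.float64):
    # Number of iterations kept when storing every `thinning`-th sample
    kept = -(-n_iter // thinning)  # ceiling division
    return n_obs * kept * n_chains * np.dtype(dtype).itemsize


# Hypothetical sizes: 1000 observations, 10000 iterations, 4 chains
full = pointwise_storage_bytes(1000, 10000, 4)
thinned = pointwise_storage_bytes(1000, 10000, 4, thinning=10)
print(full / 1e6, 'MB without thinning')
print(thinned / 1e6, 'MB keeping every 10th sample')
```

With these numbers the unthinned array takes 320 MB and the thinned one 32 MB, so storing only the samples that will actually be kept makes a large difference.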

Rebecca-Rumney · 2021-03-05T19:09:46Z

@DavAug That's a good point. What I've written there only saves the last step's log-likelihoods (so of size N) rather than the whole N x M x K matrix and it is up to the user to store it somewhere. I'm personally not sure how large N is likely to get. If we have it as an option to turn on then it may make it harder to access if we are only calling for the posterior. We would then have to alter LogPosterior and anything else that calls the log posterior to have an option of saving the pointwise likelihoods.

DavAug · 2021-03-06T08:02:14Z

I agree, I like the likelihood as you proposed! I guess I was more wondering whether we would want to store the 3-dimensional tensor in the MCMC controller in general, or maybe not store it by default and allow switching it on. But maybe this is a question for another ticket :D

MichaelClerx · 2021-03-16T10:33:15Z

Good start! But probably it'd be more efficient to have __call__ just assume you don't want to save, and have some alternative method like evaluateS1 that can be called if you really want to store each sample?
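A sketch of what this alternative could look like is shown below, again with a stand-in class and a made-up linear model. The method name `evaluate_pointwise` is hypothetical; it mirrors the evaluateS1 pattern by returning a tuple, here (total, pointwise), while `__call__` stores nothing.

```python
import numpy as np


class ToyLogLikelihood:
    """Stand-in for a ProblemLogLikelihood; not the real PINTS class."""

    def __init__(self, times, values, sigma):
        self._times = np.asarray(times, dtype=float)
        self._values = np.asarray(values, dtype=float)
        self._sigma = float(sigma)

    def _pointwise(self, x):
        # Hypothetical model: values = x[0] * times + Gaussian noise
        residuals = self._values - x[0] * self._times
        return (-0.5 * np.log(2 * np.pi * self._sigma**2)
                - residuals**2 / (2 * self._sigma**2))

    def __call__(self, x):
        # Default path: return the scalar only, keep no per-point state
        return float(np.sum(self._pointwise(x)))

    def evaluate_pointwise(self, x):
        # Opt-in path (hypothetical name): return (total, pointwise values)
        pointwise = self._pointwise(x)
        return float(np.sum(pointwise)), pointwise


log_likelihood = ToyLogLikelihood([0, 1, 2], [0, 2, 4], sigma=1.0)
total, pointwise = log_likelihood.evaluate_pointwise([2.0])
```

Whether the default path is measurably faster would still need the benchmark mentioned above; in this toy version the only saving is skipping the cached attribute.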

(I imagine there's some loss of performance if we do this by default, but we might want to benchmark that)

DavAug added the feature label Feb 21, 2021

Rebecca-Rumney mentioned this issue Mar 31, 2021

I1300 pointwise log likelihoods #1334

Open

pints-team locked and limited conversation to collaborators Jun 23, 2024

MichaelClerx converted this issue into discussion #1672 Jun 23, 2024

This issue was moved to a discussion.
