MLE estimation and convergence #200

Closed
dowdlelt opened this issue Jan 24, 2019 · 12 comments
Labels
question: issues detailing questions about the project or its direction

Comments

@dowdlelt
Collaborator

There have been a number of discussions regarding PCA/ICA selection and convergence, and this issue is an attempt to bring those threads together in one place. In testing the current implementation of tedana with the 'mle' default, I am not seeing any dimensionality reduction. That is, the PCA step with MLE selects the maximum number of components, len(timeseries) - 1, in all 6 data sets (3 different tasks) I have tested thus far (more testing to come; @jbteves may have more to report as well).

This is an improvement over the problem @handwerkerd raises in #101, where too few components cause convergence or other issues. My original concern was that this was odd behavior, as I thought dimensionality reduction was required, but apparently whitening the data prior to ICA is the only 'requirement', per @emdupre's link in #101 and comments on Gitter. Thankfully, the vast majority of these components explain very low variance and are then ignored, though I haven't examined them closely yet; looking forward to #86.

My new concern is that this contributes to ICA failing to converge. Currently, when tedana doesn't converge it continues onward, producing output. In #181 and #162 @tsalo suggests that this is probably OK and maybe even a good thing, but it's not clear to me whether this is something that needs to be fixed or is acceptable long-term behavior, which touches on the points raised in #153.

This concern is distinct from #144, in which convergence fails due to a lack of preprocessing.

So the big questions I am left with are:

  1. Are others seeing the same PC selection behavior with 'mle': the max number of components?

  2. Is data that comes from a convergence failure usable?

  3. Should we include variance-based thresholding? I believe that is what @handwerkerd was testing at one point in #101 (Should PCA component selection leverage multi-echo information?). I believe it can be done by setting n_components to a value between 0 and 1, per https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html (a minimal sketch follows below this list).
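For reference, a minimal sketch of that variance-based option with scikit-learn; the 0.95 threshold and the random data matrix are purely illustrative, not a recommendation and not tedana's code:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 1000))  # stand-in for a (time x voxels) matrix

# A float in (0, 1) keeps enough components to explain that fraction of the
# total variance (used here with svd_solver='full').
pca = PCA(n_components=0.95, svd_solver='full')
reduced = pca.fit_transform(data)
print(reduced.shape[1], 'components retained')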

Thoughts?

@tsalo
Member

tsalo commented Jan 24, 2019

  1. Yes, at least with our test data, although if you look at preliminary results from the reliability analysis, you'll see that the number definitely varies by subject. Subsequent convergence failure is also generally driven by subject, with most subjects never having it happen and some having it happen almost every time.
  2. That's definitely something we're planning to look at with the reliability analysis, but as of yet I at least don't know. My intuition is that the statistics used to select ICA components are valid, but that the denoising won't be as efficient as possible. If we imagine generating random time series for the components, we can still determine whether weights for those random time series change with echo time (i.e., are BOLD, which is very unlikely) or do not (i.e., are noise, which is likely with random data). Thus, I imagine that denoising would likely ignore BOLD signal (which is fine) and would remove noise patterns that maybe don't align as well with the noise we want to remove (e.g., motion) as we would like. Of course this is all speculative, and hopefully the reliability analysis will clear things up for us.
  3. I think including additional selection methods, including a variance threshold, is a good idea, but we shouldn't make it the default without demonstrating that it outperforms other methods. That will definitely be a part of the comparison/validation analysis.

@dowdlelt
Collaborator Author

@tsalo this is great!

  1. I hadn't seen those figures from the reliability analysis, that's a wealth of information. Happy to have created this issue just to see those.
  2. So we are leaning heavily on multi-echo principles as our defense against a failure to converge causing problems. I follow the intuition, and I wish I had a better understanding of PCA/ICA. Did we ever increase the default number of convergence iterations?
  3. It just crossed my mind as I was reading back through #101 (Should PCA component selection leverage multi-echo information?), and it seemed related.

This is absolutely highlighting the importance of the comparison/validation/reliability analyses angle. Looking forward to (hopefully) contributing to that and hearing thoughts from other folks.

@tsalo
Member

tsalo commented Jan 24, 2019

Indeed, we have increased the number of iterations to 5000 to match mdp. We also switched to using the default convergence limit, which is higher than the one we were using and thus should increase the likelihood of convergence.
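As an aside for anyone testing this, here is a hedged sketch of one way to check whether a given run actually converged, by capturing the warning scikit-learn emits when FastICA hits the iteration cap. The parameter values and the random data are illustrative only; this is not tedana's code:

import warnings
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 1000))  # illustrative (time x voxels) matrix

ica = FastICA(n_components=50, algorithm='parallel',
              fun='logcosh', max_iter=5000, random_state=42)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    sources = ica.fit_transform(data)

# scikit-learn warns that FastICA 'did not converge' when max_iter is reached
converged = not any('did not converge' in str(w.message) for w in caught)
print('converged:', converged)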

@emdupre added the question label on Jan 25, 2019
@dowdlelt
Collaborator Author

Perhaps I am very confused here regarding whitening, but it looks like the sklearn.decomposition.PCA default is not to whiten the data.
In the code here we do not override this:

ppca = PCA(n_components='mle', svd_solver='full')

However, when calling FastICA, we also do not specify anything for its whiten argument, which defaults to whiten=True.

ica = FastICA(n_components=n_components, algorithm='parallel',
              fun='logcosh', max_iter=5000, random_state=rand_state)

So is it more correct to say that we do not whiten the data at the PCA step (since whiten is not set) but rather as part of ICA? Perhaps this doesn't matter - I just noticed it when looking through the code, recalled previous conversations, and wanted to seek clarity. This is all effectively my introduction to Python, so it is likely I missed something.
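To make that concrete, here is a hedged sketch with the PCA default written out explicitly; the data shape and parameter values are illustrative, and this is not tedana's actual code:

import numpy as np
from sklearn.decomposition import PCA, FastICA

# Illustrative low-rank signal plus noise, shaped (samples x features);
# n_components='mle' requires n_samples >= n_features.
rng = np.random.default_rng(0)
data = (rng.standard_normal((5000, 5)) @ rng.standard_normal((5, 200))
        + rng.standard_normal((5000, 200)))

# The PCA step only reduces dimensionality: whiten defaults to False.
ppca = PCA(n_components='mle', svd_solver='full', whiten=False)
reduced = ppca.fit_transform(data)

# FastICA's whiten argument is left at its default, which whitens the data
# internally before unmixing, so whitening happens at the ICA step even
# though the PCA step above did not whiten.
ica = FastICA(n_components=reduced.shape[1], algorithm='parallel',
              fun='logcosh', max_iter=5000, random_state=42)
sources = ica.fit_transform(reduced)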

@tsalo
Member

tsalo commented Jan 31, 2019

@emdupre and I have discussed this in the past. See this PR comment for the original discussion, but the gist is that we use PCA for dimensional reduction, but not whitening, and we do the whitening in the ICA because we do need it (and we can't turn it off). I think we can and should change our choice of words throughout the code and summary text to reflect that.

@jbteves
Collaborator

jbteves commented Jan 31, 2019

I think there may be confusion because, at the top of #101, which @dowdlelt and I have referenced in various discussions (though I'm speaking for my own confusion -- I'm not sure what his interpretation was), @emdupre mentions that the data must be whitened. I interpreted that as using the PCA step to perform whitening, especially since that's an available option in many PCA implementations (including scikit-learn's, though it's not used in tedana) and we state that PCA is the prerequisite for a successful ICA. However, it seems that's done in the ICA. It makes sense to me to update the following:

  1. The tedpca docstring
  2. The tedica docstring
  3. Add a comment just above the ica step within the tedica function which states that scikit-learn will automatically (with no option to override) pre-whiten the data, potentially with a documentation link?

Could we also ask that @emdupre edit that comment to avoid further confusion? Since a lot of discussion happens and starts from the top, I feel it may set the tone for the rest of the conversation and make it clear for new readers.

I hope that this isn't viewed as redundant to your comment above, @tsalo -- I felt that it may be helpful for everybody to see where these issues are being cross-referenced and help us zero in on where people may read and become confused. This is such a critical component of operation for tedana that I think a lot of users and future contributors will read it.

@dowdlelt
Collaborator Author

dowdlelt commented Feb 1, 2019

@tsalo I hunted all over for that comment and couldn't find it! I had this vague recollection that this had been answered, that perhaps I had seen this before and I was being redundant. I think I searched just within issues. This makes sense to me and I'm glad I was making sense of the code correctly.

@tsalo
Member

tsalo commented Feb 1, 2019

@jbteves Thank you for tracking down where we erroneously refer to the PCA step as "whitening" the data, rather than "dimensionally reducing" it. It's a good point, and I think we can fix them pretty easily. Would you mind opening a new issue with a list of those locations? It would also serve as a good first issue for new contributors, or as low-hanging fruit for any current contributors.

@emdupre
Member

emdupre commented Feb 1, 2019

Sorry if there's any confusion from that issue ! It sounds like it'd be a good idea to go through the issues and try to clean up those that have become sprawling discussions to more concretely align with our milestones. I'll try to allot some time for that, soon.

re updating documentation: I'd be happy to see updated docstrings ! PRs welcome, or I can tackle this myself this weekend 😸

One general idea, tying back into the top point: It looks like there are a few dimensions even to this single issue. Maybe we can focus the conversation here on the MLE and convergence questions, and open a new issue if we want to keep discussing the PCA / ICA relationship ? Or, if we'd like that conversation to happen in a PR review, that's fine as well !

@jbteves
Collaborator

jbteves commented Feb 1, 2019

@tsalo opened as #209
@emdupre I'm happy to help with the issues sprawl-- I've been reading through them a lot lately trying to get up to speed.

@tsalo
Member

tsalo commented Apr 28, 2019

I think that discussions of convergence are currently spread out over several issues and conversations, and also that evaluation of MLE dimensionality estimation and ICA convergence are under the umbrella of the reliability analysis at this point. @dowdlelt are you okay with closing this?

@dowdlelt
Collaborator Author

Absolutely - let's clean some things up.
