Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Use non-parametric method for regression z-statistic estimation #461

Closed
wants to merge 4 commits into from

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented Nov 13, 2019

** Update: Closed in favor of a better implementation of the parametric approach by @CesarCaballeroGaudes

Closes #178.

Essentially, the problem is that the approach we use in computefeats2 to calculate voxel-wise z-values (not z-statistics) associated with our components is rather odd and wrong. See this notebook for a simple comparison of methods, looking at RMSE compared to a slow, but valid, method.

The current problem is that I'm getting the following error:

/Users/tsalo/Documents/tsalo/tedana/tedana/metrics/kundu_fit.py:118: RuntimeWarning: invalid value encountered in true_divide
  signs /= np.abs(signs)
 ** On entry to DLASCLS parameter number  4 had an illegal value
 ** On entry to DLASCLS parameter number  4 had an illegal value
Traceback (most recent call last):
  File "/Users/tsalo/anaconda/envs/python3/bin/tedana", line 11, in <module>
    load_entry_point('tedana', 'console_scripts', 'tedana')()
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/workflows/tedana.py", line 614, in _main
    tedana_workflow(**vars(options))
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/workflows/tedana.py", line 478, in tedana_workflow
    low_mem=low_mem)
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/decomposition/pca.py", line 280, in tedpca
    label='mepca_', out_dir=out_dir, verbose=verbose)
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/metrics/kundu_fit.py", line 129, in dependence_metrics
    np.repeat(mask[:, np.newaxis], len(tes), axis=1))
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/stats.py", line 161, in get_coeffs
  File "<__array_function__ internals>", line 6, in lstsq
  File "/Users/tsalo/anaconda/envs/python3/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 2268, in lstsq
    x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
  File "/Users/tsalo/anaconda/envs/python3/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 109, in _raise_linalgerror_lstsq
    raise LinAlgError("SVD did not converge in Linear Least Squares")
numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

I believe that the error is due to voxels with no variability.

Changes proposed in this pull request:

  • Use nilearn's permuted_ols method within computefeats2 to calculate valid regressor-wise z-statistics for multiple regressions even with low degrees of freedom.

@rmarkello
Copy link
Member

If we're really going to go non-parametric for the estimation of these standardized beta weights I think we should be considering bootstrapping instead of permutation testing. The p-to-z conversion is a bit hacky/post-hoc when we could instead just go from first principle and bootstrap the estimates to derive a "population" standard error and normalize with that.

I agree that the current method is less than optimal, and I'm still open to using the permutations, but I think it's worthwhile discussing the merits of the different choices. Would you mind explaining your preference for permutations over bootstrapping, here?

@tsalo
Copy link
Member Author

tsalo commented Nov 15, 2019

@CesarCaballeroGaudes is working on a parametric version that calculates the z-statistics properly, so if that ends up working I think we'll drop this and just use that method. If there somehow ends up being a problem with the results, I think we should dig into a nonparametric approach further.

@tsalo
Copy link
Member Author

tsalo commented Dec 1, 2019

@rmarkello, I'm planning to close this PR given @CesarCaballeroGaudes' planned PR, but I was wondering if the bootstrapped estimates approach would be an improvement over Nilearn's permuted_ols more generally?

@rmarkello
Copy link
Member

Sorry for the delay in responding to this! I think that, generally speaking, if you want to estimate standard errors around any statistical metric then bootstrapping is the way to go (over permutation testing) 👍

That said, it sounds like @CesarCaballeroGaudes's planned PR will supersede this and should be sufficient for the purposes of tedana. Bootstrapping can end up being quite computationally expensive and since we're trying to slim down tedana that might not be the best route.

@tsalo tsalo closed this Dec 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

computefeats2 does not calculate z-statistics accurately
2 participants