[ENH] Use non-parametric method for regression z-statistic estimation #461

tsalo · 2019-11-13T19:38:29Z

** Update: Closed in favor of a better implementation of the parametric approach by @CesarCaballeroGaudes

Closes #178.

Essentially, the problem is that the approach we use in computefeats2 to calculate voxel-wise z-values (not z-statistics) associated with our components is rather odd and wrong. See this notebook for a simple comparison of methods, looking at RMSE compared to a slow, but valid, method.

The current problem is that I'm getting the following error:

/Users/tsalo/Documents/tsalo/tedana/tedana/metrics/kundu_fit.py:118: RuntimeWarning: invalid value encountered in true_divide
  signs /= np.abs(signs)
 ** On entry to DLASCLS parameter number  4 had an illegal value
 ** On entry to DLASCLS parameter number  4 had an illegal value
Traceback (most recent call last):
  File "/Users/tsalo/anaconda/envs/python3/bin/tedana", line 11, in <module>
    load_entry_point('tedana', 'console_scripts', 'tedana')()
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/workflows/tedana.py", line 614, in _main
    tedana_workflow(**vars(options))
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/workflows/tedana.py", line 478, in tedana_workflow
    low_mem=low_mem)
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/decomposition/pca.py", line 280, in tedpca
    label='mepca_', out_dir=out_dir, verbose=verbose)
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/metrics/kundu_fit.py", line 129, in dependence_metrics
    np.repeat(mask[:, np.newaxis], len(tes), axis=1))
  File "/Users/tsalo/Documents/tsalo/tedana/tedana/stats.py", line 161, in get_coeffs
  File "<__array_function__ internals>", line 6, in lstsq
  File "/Users/tsalo/anaconda/envs/python3/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 2268, in lstsq
    x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
  File "/Users/tsalo/anaconda/envs/python3/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 109, in _raise_linalgerror_lstsq
    raise LinAlgError("SVD did not converge in Linear Least Squares")
numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

I believe that the error is due to voxels with no variability.

Changes proposed in this pull request:

Use nilearn's permuted_ols method within computefeats2 to calculate valid regressor-wise z-statistics for multiple regressions even with low degrees of freedom.

It’s not working for some reason.

rmarkello · 2019-11-15T19:09:44Z

If we're really going to go non-parametric for the estimation of these standardized beta weights I think we should be considering bootstrapping instead of permutation testing. The p-to-z conversion is a bit hacky/post-hoc when we could instead just go from first principle and bootstrap the estimates to derive a "population" standard error and normalize with that.

I agree that the current method is less than optimal, and I'm still open to using the permutations, but I think it's worthwhile discussing the merits of the different choices. Would you mind explaining your preference for permutations over bootstrapping, here?

tsalo · 2019-11-15T20:23:28Z

@CesarCaballeroGaudes is working on a parametric version that calculates the z-statistics properly, so if that ends up working I think we'll drop this and just use that method. If there somehow ends up being a problem with the results, I think we should dig into a nonparametric approach further.

tsalo · 2019-12-01T16:45:59Z

@rmarkello, I'm planning to close this PR given @CesarCaballeroGaudes' planned PR, but I was wondering if the bootstrapped estimates approach would be an improvement over Nilearn's permuted_ols more generally?

rmarkello · 2019-12-05T17:41:30Z

Sorry for the delay in responding to this! I think that, generally speaking, if you want to estimate standard errors around any statistical metric then bootstrapping is the way to go (over permutation testing) 👍

That said, it sounds like @CesarCaballeroGaudes's planned PR will supersede this and should be sufficient for the purposes of tedana. Bootstrapping can end up being quite computationally expensive and since we're trying to slim down tedana that might not be the best route.

tsalo added 4 commits November 12, 2019 23:13

Add functions.

9fb7a09

It’s not working for some reason.

Simplify computefeats2 call.

38fb544

Remove print statements.

995b159

Reduce number of permutations.

7883154

tsalo mentioned this pull request Dec 1, 2019

Changing z scores computation in stats.py #477

Closed

tsalo closed this Dec 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Use non-parametric method for regression z-statistic estimation #461

[ENH] Use non-parametric method for regression z-statistic estimation #461

tsalo commented Nov 13, 2019 •

edited

Loading

rmarkello commented Nov 15, 2019

tsalo commented Nov 15, 2019

tsalo commented Dec 1, 2019

rmarkello commented Dec 5, 2019

[ENH] Use non-parametric method for regression z-statistic estimation #461

[ENH] Use non-parametric method for regression z-statistic estimation #461

Conversation

tsalo commented Nov 13, 2019 • edited Loading

rmarkello commented Nov 15, 2019

tsalo commented Nov 15, 2019

tsalo commented Dec 1, 2019

rmarkello commented Dec 5, 2019

tsalo commented Nov 13, 2019 •

edited

Loading