[ENH] BIDS Derivatives-compatible outputs #574

Merged
merged 34 commits into main from ref/filenames-and-variables on Feb 24, 2021
Changes from 33 commits
Commits (34)
21e479c
BIDS Derivatives-compatible outputs.
tsalo May 22, 2020
9dd2728
Fix outputs.
tsalo May 22, 2020
41c9512
Merge branch 'master' into ref/filenames-and-variables
tsalo Jul 13, 2020
66b0a0c
Fix filenames.
tsalo Jul 14, 2020
000997e
Merge branch 'master' into ref/filenames-and-variables
tsalo Sep 22, 2020
eb23f74
Merge branch 'master' into ref/filenames-and-variables
tsalo Oct 22, 2020
737b19c
Reformat MIR function with black.
tsalo Oct 22, 2020
0b82189
Update fiu_four_echo_outputs.txt
tsalo Oct 22, 2020
d00b03a
Merge branch 'main' into ref/filenames-and-variables
tsalo Feb 6, 2021
51c7a7b
Reorganize metric outputs.
tsalo Feb 6, 2021
65784d8
Write out PCA outputs.
tsalo Feb 7, 2021
7f88281
Fix tests.
tsalo Feb 7, 2021
141c1b1
Fix up gscontrol.
tsalo Feb 7, 2021
b6a375c
Fix row indexing.
tsalo Feb 7, 2021
06bd83f
Fix suffix.
tsalo Feb 7, 2021
de0526f
Simplify handling of component indices.
tsalo Feb 7, 2021
b5b69a1
Fix.
tsalo Feb 7, 2021
840b1c0
Fix the reports.
tsalo Feb 7, 2021
7e70a7b
Fix the bugs.
tsalo Feb 7, 2021
860bff3
Fix more bugs.
tsalo Feb 9, 2021
e587849
Fix everything!
tsalo Feb 9, 2021
bb15c2e
Fix the loading?
tsalo Feb 9, 2021
4303b97
Update fiu_four_echo_outputs.txt
tsalo Feb 9, 2021
1aa52bc
Fix outputs!
tsalo Feb 9, 2021
2b1524f
Ugh...
tsalo Feb 10, 2021
6b4ab31
Add dataset_description.json.
tsalo Feb 10, 2021
ed45069
Update outputs and workflow description.
tsalo Feb 10, 2021
f7a4103
Fix output list.
tsalo Feb 10, 2021
fb7e8f8
Fix more outputs.
tsalo Feb 10, 2021
39a18f7
Remove unused functions.
tsalo Feb 10, 2021
245c7bd
Fix imports.
tsalo Feb 10, 2021
ed85699
Change desc-ICA to desc-tedana.
tsalo Feb 12, 2021
cd1ce51
Update docs.
tsalo Feb 12, 2021
cc9296b
Update docs/outputs.rst
tsalo Feb 23, 2021
16 changes: 8 additions & 8 deletions docs/approach.rst
@@ -69,7 +69,7 @@ estimate voxel-wise :math:`T_{2}^*` and :math:`S_0`.
:math:`S_0` corresponds to the total signal in each voxel before decay and can reflect coil sensitivity.
:math:`T_{2}^*` corresponds to the rate at which a voxel decays over time, which
is related to signal dropout and BOLD sensitivity.
-Estimates of the parameters are saved as **t2sv.nii.gz** and **s0v.nii.gz**.
+Estimates of the parameters are saved as **T2starmap.nii.gz** and **S0map.nii.gz**.

While :math:`T_{2}^*` and :math:`S_0` in fact fluctuate over time, estimating
them on a volume-by-volume basis with only a small number of echoes is not
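To make the fit concrete: because :math:`\log(S) = \log(S_0) - TE / T_{2}^*`, both maps can be estimated with an ordinary least-squares fit of log-signal against echo time. A minimal sketch (the helper name and array shapes are hypothetical; tedana's actual implementation lives in ``tedana.decay`` and also handles adaptive masking):

```python
import numpy as np

def fit_monoexponential(echo_means, tes):
    """Voxel-wise log-linear fit of S(TE) = S0 * exp(-TE / T2*).

    echo_means : (n_voxels, n_echos) mean signal at each echo
    tes        : (n_echos,) echo times
    """
    tes = np.asarray(tes, dtype=float)
    # Taking the log linearizes the decay: log(S) = log(S0) - TE / T2*
    log_means = np.log(np.maximum(echo_means, np.finfo(float).eps))
    design = np.column_stack([np.ones_like(tes), -tes])
    betas, *_ = np.linalg.lstsq(design, log_means.T, rcond=None)
    s0map = np.exp(betas[0])  # intercept -> S0map.nii.gz
    t2starmap = 1.0 / np.maximum(betas[1], np.finfo(float).eps)  # slope -> T2starmap.nii.gz
    return t2starmap, s0map
```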
@@ -153,7 +153,7 @@ between the distributions for other echoes.

The time series for the optimally combined data also looks like a combination
of the other echoes (which it is).
-This optimally combined data is written out as **ts_OC.nii.gz**.
+This optimally combined data is written out as **desc-optcom_bold.nii.gz**.

.. image:: /_static/a10_optimal_combination_timeseries.png
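The combination itself is a voxel-wise weighted average in which each echo is weighted by :math:`TE \cdot e^{-TE / T_{2}^*}`, the T2*-weighted scheme of Posse et al. (1999). A minimal sketch, assuming the array shapes noted in the comments:

```python
import numpy as np

def optimal_combine(data, tes, t2starmap):
    """T2*-weighted combination of echoes.

    data      : (n_voxels, n_echos, n_trs) multi-echo time series
    tes       : (n_echos,) echo times
    t2starmap : (n_voxels,) voxel-wise T2* estimates
    """
    tes = np.asarray(tes, dtype=float)
    # Weight each echo by TE * exp(-TE / T2*) and normalize across echoes
    weights = tes[np.newaxis, :] * np.exp(-tes[np.newaxis, :] / t2starmap[:, np.newaxis])
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted sum over the echo axis -> desc-optcom_bold.nii.gz
    return np.einsum('ve,vet->vt', weights, data)
```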

@@ -194,7 +194,7 @@ component analysis (PCA).
The goal of this step is to make it easier for the later ICA decomposition to converge.
Dimensionality reduction is a common step prior to ICA.
TEDPCA applies PCA to the optimally combined data in order to decompose it into component maps and
-time series (saved as **mepca_mix.1D**).
+time series (saved as **desc-PCA_mixing.tsv**).
Here we can see time series for some example components (we don't really care about the maps):

.. image:: /_static/a11_pca_component_timeseries.png
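As a rough sketch of this step (scikit-learn's ``PCA`` stands in for tedana's actual TEDPCA code, and the component names are illustrative; tedana builds them with ``io.add_decomp_prefix``):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical optimally combined data, shaped (n_voxels, n_trs)
data_oc = np.random.randn(5000, 200)

# Retain enough components to explain 95% of the variance; tedana instead
# selects the dimensionality with criteria such as the Kundu decision tree
pca = PCA(n_components=0.95)
comp_ts = pca.fit_transform(data_oc.T)  # (n_trs, n_components) time series

# Mixing matrix in the same layout as desc-PCA_mixing.tsv
columns = ['PCA_comp_{0:03d}'.format(ii) for ii in range(comp_ts.shape[1])]
pd.DataFrame(comp_ts, columns=columns).to_csv(
    'desc-PCA_mixing.tsv', sep='\t', index=False)
```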
@@ -277,9 +277,9 @@ Next, ``tedana`` applies TE-dependent independent component analysis (ICA) in
order to identify and remove TE-independent (i.e., non-BOLD noise) components.
The dimensionally reduced optimally combined data are first subjected to ICA in
order to fit a mixing matrix to the whitened data.
-This generates a number of independent timeseries (saved as **meica_mix.1D**),
-as well as beta maps which show the spatial loading of these components on the
-brain (**betas_OC.nii.gz**).
+This generates a number of independent timeseries (saved as **desc-ICA_mixing.tsv**),
+as well as parameter estimate maps which show the spatial loading of these components on the
+brain (**desc-ICA_components.nii.gz**).

.. image:: /_static/a13_ica_component_timeseries.png
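A comparable sketch of the ICA step, using scikit-learn's ``FastICA`` (the component count and names here are placeholders; the dimensionality comes from the TEDPCA step above):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import FastICA

# Hypothetical dimensionally reduced, optimally combined data: (n_voxels, n_trs)
data_red = np.random.randn(5000, 200)
n_components = 30  # as selected during TEDPCA

ica = FastICA(n_components=n_components, max_iter=5000)
mmix = ica.fit_transform(data_red.T)  # (n_trs, n_components) mixing matrix

# Parameter estimate maps: voxel-wise least-squares fit of the mixing matrix
betas, *_ = np.linalg.lstsq(mmix, data_red.T, rcond=None)  # (n_components, n_voxels)

columns = ['ICA_comp_{0:03d}'.format(ii) for ii in range(n_components)]
pd.DataFrame(mmix, columns=columns).to_csv(
    'desc-ICA_mixing.tsv', sep='\t', index=False)
```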

@@ -312,13 +312,13 @@ The blue and red lines show the predicted values for the :math:`S_0` and
A decision tree is applied to :math:`\kappa`, :math:`\rho`, and other metrics in order to
classify ICA components as TE-dependent (BOLD signal), TE-independent
(non-BOLD noise), or neither (to be ignored).
-These classifications are saved in `comp_table_ica.txt`.
+These classifications are saved in **desc-tedana_metrics.tsv**.
The actual decision tree is dependent on the component selection algorithm employed.
``tedana`` includes the option `kundu` (which uses hardcoded thresholds
applied to each of the metrics).

Components that are classified as noise are projected out of the optimally combined data,
-yielding a denoised timeseries, which is saved as `dn_ts_OC.nii.gz`.
+yielding a denoised timeseries, which is saved as **desc-optcomDenoised_bold.nii.gz**.

.. image:: /_static/a15_denoised_data_timeseries.png
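The projection amounts to fitting every component time series to each voxel and subtracting the part of the signal explained by the rejected components. A minimal sketch (hypothetical function name):

```python
import numpy as np

def denoise_optcom(optcom, mmix, rejected):
    """Project rejected (non-BOLD) components out of the optimally combined data.

    optcom   : (n_voxels, n_trs) optimally combined data
    mmix     : (n_trs, n_components) ICA mixing matrix
    rejected : indices of components classified as noise
    """
    # Voxel-wise fit of every component time series to the data
    betas, *_ = np.linalg.lstsq(mmix, optcom.T, rcond=None)
    # Reconstruct the noise part alone and subtract it
    noise = mmix[:, rejected] @ betas[rejected, :]  # (n_trs, n_voxels)
    return optcom - noise.T  # -> desc-optcomDenoised_bold.nii.gz
```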

218 changes: 118 additions & 100 deletions docs/outputs.rst

Large diffs are not rendered by default.

89 changes: 63 additions & 26 deletions tedana/decomposition/pca.py
@@ -1,6 +1,7 @@
"""
PCA and related signal decomposition methods for tedana
"""
+import json
import logging
import os.path as op
from numbers import Number
@@ -148,13 +149,13 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,

This function writes out several files:

-====================== =================================================
-Filename               Content
-====================== =================================================
-pca_decomposition.json PCA component table.
-pca_mixing.tsv         PCA mixing matrix.
-pca_components.nii.gz  Component weight maps.
-====================== =================================================
+=========================== =============================================
+Filename                    Content
+=========================== =============================================
+desc-PCA_decomposition.json PCA component table
+desc-PCA_mixing.tsv         PCA mixing matrix
+desc-PCA_components.nii.gz  Component weight maps
+=========================== =============================================

See Also
--------
@@ -231,10 +232,10 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
# Compute Kappa and Rho for PCA comps
# Normalize each component's time series
vTmixN = stats.zscore(comp_ts, axis=0)
-comptable, _, _, _ = metrics.dependence_metrics(
+comptable, _, metric_metadata, _, _ = metrics.dependence_metrics(
data_cat, data_oc, comp_ts, adaptive_mask, tes, ref_img,
reindex=False, mmixN=vTmixN, algorithm=None,
-label='mepca_', out_dir=out_dir, verbose=verbose)
+label='PCA', out_dir=out_dir, verbose=verbose)

# varex_norm from PCA overrides varex_norm from dependence_metrics,
# but we retain the original
@@ -243,38 +244,74 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
comptable['normalized variance explained'] = varex_norm

# write component maps to 4D image
-comp_ts_z = stats.zscore(comp_ts, axis=0)
-comp_maps = utils.unmask(computefeats2(data_oc, comp_ts_z, mask), mask)
-io.filewrite(comp_maps, op.join(out_dir, 'pca_components.nii.gz'), ref_img)
+comp_maps = utils.unmask(computefeats2(data_oc, comp_ts, mask), mask)
+io.filewrite(comp_maps, op.join(out_dir, 'desc-PCA_stat-z_components.nii.gz'), ref_img)

# Select components using decision tree
if algorithm == 'kundu':
-comptable = kundu_tedpca(comptable, n_echos, kdaw, rdaw, stabilize=False)
+comptable, metric_metadata = kundu_tedpca(
+    comptable,
+    metric_metadata,
+    n_echos,
+    kdaw,
+    rdaw,
+    stabilize=False,
+)
elif algorithm == 'kundu-stabilize':
-comptable = kundu_tedpca(comptable, n_echos, kdaw, rdaw, stabilize=True)
+comptable, metric_metadata = kundu_tedpca(
+    comptable,
+    metric_metadata,
+    n_echos,
+    kdaw,
+    rdaw,
+    stabilize=True,
+)
else:
alg_str = "variance explained-based" if isinstance(algorithm, Number) else algorithm
LGR.info('Selected {0} components with {1} dimensionality '
'detection'.format(comptable.shape[0], alg_str))
comptable['classification'] = 'accepted'
comptable['rationale'] = ''

-# Save decomposition
+# Save decomposition files
comp_names = [io.add_decomp_prefix(comp, prefix='pca', max_value=comptable.index.max())
for comp in comptable.index.values]

mixing_df = pd.DataFrame(data=comp_ts, columns=comp_names)
-mixing_df.to_csv(op.join(out_dir, 'pca_mixing.tsv'), sep='\t', index=False)
-
-comptable['Description'] = 'PCA fit to optimally combined data.'
-mmix_dict = {}
-mmix_dict['Method'] = ('Principal components analysis implemented by '
-                       'sklearn. Components are sorted by variance '
-                       'explained in descending order. '
-                       'Component signs are flipped to best match the '
-                       'data.')
-io.save_comptable(comptable, op.join(out_dir, 'pca_decomposition.json'),
-                  label='pca', metadata=mmix_dict)
+mixing_df.to_csv(op.join(out_dir, 'desc-PCA_mixing.tsv'), sep='\t', index=False)
+
+# Save component table and associated json
+temp_comptable = comptable.set_index("Component", inplace=False)
+temp_comptable.to_csv(
+    op.join(out_dir, "desc-PCA_metrics.tsv"),
+    index=True,
+    index_label="Component",
+    sep='\t',
+)
+metric_metadata["Component"] = {
+    "LongName": "Component identifier",
+    "Description": (
+        "The unique identifier of each component. "
+        "This identifier matches column names in the mixing matrix TSV file."
+    ),
+}
+with open(op.join(out_dir, "desc-PCA_metrics.json"), "w") as fo:
+    json.dump(metric_metadata, fo, sort_keys=True, indent=4)
+
+decomp_metadata = {
+    "Method": (
+        "Principal components analysis implemented by sklearn. "
+        "Components are sorted by variance explained in descending order. "
+        "Component signs are flipped to best match the data."
+    ),
+}
+for comp_name in comp_names:
+    decomp_metadata[comp_name] = {
+        "Description": "PCA fit to optimally combined data.",
+        "Method": "tedana",
+    }
+with open(op.join(out_dir, "desc-PCA_decomposition.json"), "w") as fo:
+    json.dump(decomp_metadata, fo, sort_keys=True, indent=4)

acc = comptable[comptable.classification == 'accepted'].index.values
n_components = acc.size
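The metrics TSV and its JSON sidecar written above form a BIDS Derivatives-style pair in which every column of the table is documented by a matching sidecar entry. A downstream script might consume them like this (a minimal sketch, assuming the files sit in the working directory):

```python
import json

import pandas as pd

metrics = pd.read_csv('desc-PCA_metrics.tsv', sep='\t', index_col='Component')
with open('desc-PCA_metrics.json', 'r') as fo:
    metadata = json.load(fo)

# Each metric column is described by the corresponding sidecar entry
for column in metrics.columns:
    info = metadata.get(column, {})
    print(column, '-', info.get('Description', '(no description)'))
```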
45 changes: 36 additions & 9 deletions tedana/gscontrol.py
@@ -5,6 +5,7 @@
import os.path as op

import numpy as np
+import pandas as pd
from scipy import stats
from scipy.special import lpmv

@@ -78,23 +79,38 @@ def gscontrol_raw(catd, optcom, n_echos, ref_img, out_dir='.', dtrank=4):
detr = dat - np.dot(sol.T, Lmix.T)[0]
sphis = (detr).min(axis=1)
sphis -= sphis.mean()
-io.filewrite(utils.unmask(sphis, Gmask), op.join(out_dir, 'T1gs'), ref_img)
+io.filewrite(
+    utils.unmask(sphis, Gmask),
+    op.join(out_dir, 'desc-globalSignal_map.nii.gz'),
+    ref_img
+)

# find time course of the spatial global signal
# make basis with the Legendre basis
glsig = np.linalg.lstsq(np.atleast_2d(sphis).T, dat, rcond=None)[0]
glsig = stats.zscore(glsig, axis=None)
-np.savetxt(op.join(out_dir, 'glsig.1D'), glsig)

+glsig_df = pd.DataFrame(data=glsig.T, columns=['global_signal'])
+glsig_df.to_csv(op.join(out_dir, 'desc-globalSignal_timeseries.tsv'),
+                sep='\t', index=False)
glbase = np.hstack([Lmix, glsig.T])

# Project global signal out of optimally combined data
sol = np.linalg.lstsq(np.atleast_2d(glbase), dat.T, rcond=None)[0]
tsoc_nogs = dat - np.dot(np.atleast_2d(sol[dtrank]).T,
np.atleast_2d(glbase.T[dtrank])) + Gmu[Gmask][:, np.newaxis]

-io.filewrite(optcom, op.join(out_dir, 'tsoc_orig'), ref_img)
+io.filewrite(
+    optcom,
+    op.join(out_dir, 'desc-optcomWithGlobalSignal_bold.nii.gz'),
+    ref_img
+)
dm_optcom = utils.unmask(tsoc_nogs, Gmask)
-io.filewrite(dm_optcom, op.join(out_dir, 'tsoc_nogs'), ref_img)
+io.filewrite(
+    dm_optcom,
+    op.join(out_dir, 'desc-optcomNoGlobalSignal_bold.nii.gz'),
+    ref_img
+)

# Project glbase out of each echo
dm_catd = catd.copy() # don't overwrite catd
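The pattern used throughout ``gscontrol_raw`` is to build a low-order Legendre detrending basis, append the estimated global signal, and regress the combined design out of every voxel. A minimal sketch of that pattern (hypothetical helpers; the function itself builds its basis with ``scipy.special.lpmv``):

```python
import numpy as np

def legendre_basis(n_trs, order):
    """Detrending basis of Legendre polynomials evaluated over the scan."""
    x = np.linspace(-1, 1, n_trs)
    return np.column_stack(
        [np.polynomial.legendre.Legendre.basis(deg)(x) for deg in range(order + 1)]
    )

def project_out(data, regressors):
    """Regress nuisance time series out of each voxel.

    data       : (n_voxels, n_trs)
    regressors : (n_trs, n_regressors), e.g. Legendre basis plus global signal
    """
    sol, *_ = np.linalg.lstsq(regressors, data.T, rcond=None)
    return data - (regressors @ sol).T
```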
@@ -202,7 +218,9 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
mehk_ts = np.dot(comp_pes[:, acc], mmix[:, acc].T)
t1_map = mehk_ts.min(axis=-1) # map of T1-like effect
t1_map -= t1_map.mean()
-io.filewrite(utils.unmask(t1_map, mask), op.join(out_dir, "sphis_hik"), ref_img)
+io.filewrite(
+    utils.unmask(t1_map, mask), op.join(out_dir, "desc-T1likeEffect_min.nii.gz"), ref_img
+)
t1_map = t1_map[:, np.newaxis]

# Find the global signal based on the T1-like effect
@@ -213,11 +231,19 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
np.linalg.lstsq(glob_sig.T, mehk_ts.T, rcond=None)[0].T, glob_sig
)
hik_ts = mehk_noT1gs * optcom_std # rescale
-io.filewrite(utils.unmask(hik_ts, mask), op.join(out_dir, "hik_ts_OC_MIR"), ref_img)
+io.filewrite(
+    utils.unmask(hik_ts, mask),
+    op.join(out_dir, "desc-optcomAcceptedMIRDenoised_bold.nii.gz"),
+    ref_img,
+)

# Make denoised version of T1-corrected time series
medn_ts = optcom_mean + ((mehk_noT1gs + resid) * optcom_std)
-io.filewrite(utils.unmask(medn_ts, mask), op.join(out_dir, "dn_ts_OC_MIR"), ref_img)
+io.filewrite(
+    utils.unmask(medn_ts, mask),
+    op.join(out_dir, "desc-optcomMIRDenoised_bold.nii.gz"),
+    ref_img,
+)

# Orthogonalize mixing matrix w.r.t. T1-GS
mmix_noT1gs = mmix.T - np.dot(
@@ -232,7 +258,8 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
comp_pes_norm = np.linalg.lstsq(mmix_noT1gs_z.T, optcom_z.T, rcond=None)[0].T
io.filewrite(
utils.unmask(comp_pes_norm[:, 2:], mask),
op.join(out_dir, "betas_hik_OC_MIR"),
op.join(out_dir, "desc-ICAAcceptedMIRDenoised_components.nii.gz"),
ref_img,
)
np.savetxt(op.join(out_dir, "meica_mix_MIR.1D"), mmix_noT1gs)
mixing_df = pd.DataFrame(data=mmix_noT1gs.T, columns=comptable["Component"].values)
mixing_df.to_csv(op.join(out_dir, "desc-ICAMIRDenoised_mixing.tsv"), sep='\t', index=False)