Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change decomposition files from json back to tsv #649

Closed
tsalo opened this issue Jan 15, 2021 · 6 comments · Fixed by #574 or #691
Closed

Change decomposition files from json back to tsv #649

tsalo opened this issue Jan 15, 2021 · 6 comments · Fixed by #574 or #691
Labels
refactoring issues proposing/requesting changes to the code which do not impact behavior

Comments

@tsalo
Copy link
Member

tsalo commented Jan 15, 2021

Summary

We originally had our component tables stored in TSV format, but we switched to JSONs in nipreps/fmriprep#456 to better match BIDS Derivatives convention. However, after discussing the issue with some of the BIDS folks recently (please see nipreps/fmripost-aroma#10), it sounds like a TSV is more flexible and is BIDS Derivatives-compatible. We can still place metadata about the components/decomposition in a JSON.

Additional Detail

Next Steps

  1. Review BEP012 to determine what should go in the json and what should go in the tsv.
  2. Restructure outputs so that the core component table is outputted in a tsv.
  3. Update documentation.
  4. Update visualization code.
  5. Update component table ingestion step in the tedana workflow.
@tsalo tsalo added the refactoring issues proposing/requesting changes to the code which do not impact behavior label Jan 15, 2021
@tsalo
Copy link
Member Author

tsalo commented Feb 6, 2021

Somehow I forgot updates to the fMRIPrep/BIDS conversation (see summary at nipreps/fmripost-aroma#10), but basically the BIDS maintainers agreed to stick with json files for the immediate future. However, I'm still leaning toward using a TSV here and in the aroma repo, because TSVs are easier to read and because we can use a standard TSV/JSON combo to have our metrics in the TSV and metadata about the metrics in the JSON. However, I want to make sure others agree before I implement this. What does everyone think?

@tsalo
Copy link
Member Author

tsalo commented Feb 6, 2021

Here's what I'm thinking:

desc-ICA_mixing.tsv (required)

ica_00 ica_01 ica_02 ica_03 ica_04
0 1.96128 0.715816 1.81138 2.75372 1.84941
1 0.15217 -0.85096 -0.101651 0.147915 -0.363037
2 0.326455 -0.392816 1.1161 0.450553 0.580335
3 0.0141179 -0.871712 1.02556 0.678525 -0.0628587
4 -0.0459112 -1.02132 1.10107 0.306312 -0.400553

desc-ICA_decomposition.json (required)

{
    "Method": "Independent components analysis with FastICA algorithm implemented by sklearn. Components are sorted by Kappa in descending order. Component signs are flipped to best match the data.",
    "ica_00": {
        "Description": "ICA fit to dimensionally-reduced optimally combined data.",
        "Method": "tedana"
    },
    "ica_01": {
        "Description": "ICA fit to dimensionally-reduced optimally combined data.",
        "Method": "tedana"
    },
    "ica_02": {
        "Description": "ICA fit to dimensionally-reduced optimally combined data.",
        "Method": "tedana"
    },
    "ica_03": {
        "Description": "ICA fit to dimensionally-reduced optimally combined data.",
        "Method": "tedana"
    },
    "ica_04": {
        "Description": "ICA fit to dimensionally-reduced optimally combined data.",
        "Method": "tedana"
    }
}

desc-ICA_metrics.tsv (new)

Component kappa rho variance explained normalized variance explained countsigFR2 countsigFS0 dice_FR2 dice_FS0 countnoise signal-noise_t signal-noise_p d_table_score kappa ratio d_table_score_scrub classification rationale
ica_00 106.314 17.8163 0.507241 0.00403574 3856 1826 0.514541 0.243511 674 14.2077 7.00595e-42 4.2 0.292705 2.4 accepted nan
ica_01 91.4559 18.1357 0.569411 0.00297853 4478 2762 0.569481 0.301275 913 8.30499 4.8039e-16 6.4 0.381962 4.4 accepted nan
ica_02 81.9524 19.4905 0.33585 0.00195625 3506 2334 0.520029 0.2966 809 12.2543 2.09485e-31 7.4 0.251414 5 accepted nan
ica_03 74.4902 15.7926 0.517262 0.00306524 4043 2407 0.54551 0.189069 897 13.9354 2.5095e-40 5 0.426008 3 accepted nan
ica_04 72.7947 17.2831 0.32673 0.00231219 3986 3015 0.52485 0.238084 832 13.7875 4.04151e-39 5.4 0.275356 3.6 accepted nan

desc-ICA_metrics.json (new)

{
    "Component": {
        "Description": "The unique identifier of each component. This identifier matches column names in the mixing matrix TSV file.",
        "LongName": "Component identifier"
    },
    "classification": {
        "Description": "Classification from the classification procedure.",
        "Levels": {
            "accepted": "A BOLD-like component included in denoised and high-Kappa data.",
            "ignored": "A low-variance component included in denoised, but excluded from high-Kappa data.",
            "rejected": "A non-BOLD component excluded from denoised and high-Kappa data."
        },
        "LongName": "Component classification"
    },
    "countnoise": {
        "Description": "Number of 'noise' voxels (voxels highly weighted for component, but not from clusters) from each component.",
        "LongName": "Noise voxel count",
        "Units": "voxel"
    },
    "countsigFR2": {
        "Description": "Number of significant voxels from the cluster-extent thresholded R2 model F-statistic map for each component.",
        "LongName": "R2 model F-statistic map significant voxel count",
        "Units": "voxel"
    },
}

@eurunuela
Copy link
Collaborator

Sounds good to me!

@emdupre
Copy link
Member

emdupre commented Feb 9, 2021

They look beautiful ! Per the linked fMRIPrep discussion, though, does that mean these files are not BIDS compatible for the moment ?

@tsalo
Copy link
Member Author

tsalo commented Feb 9, 2021

From what I recall, there is nothing in the derivatives spec prohibiting them, and I think that they are ultimately cleaner than a pure-json approach, but I can't guarantee that other tools (especially fMRIPrep) will want to adopt this approach. That's part of why I want input on ME-ICA/aroma#5 as well.

@tsalo
Copy link
Member Author

tsalo commented Feb 10, 2021

One other thing we could do is use desc-tedana instead of desc-ICA for the metric files (see ME-ICA/aroma#5), in order to support multiple algorithms. I don't know how to link the metric files to the decomposition files in that case though... but it's probably something worth looking into.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring issues proposing/requesting changes to the code which do not impact behavior
Projects
None yet
3 participants