Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add res and den keywords to indicate resolution of resampled data #301

Merged

Conversation

oesteban
Copy link
Collaborator

Covers the use case when one would like to have several resamplings of a derivative (i.e., keep high-res and low-res versions of data).

cc/ @bids-standard/derivatives

@oesteban oesteban marked this pull request as ready for review August 13, 2019 20:29
@franklin-feingold franklin-feingold added this to the 1.3.0 milestone Aug 14, 2019
src/05-derivatives/02-common-data-types.md Outdated Show resolved Hide resolved
{
"SkullStripped": true,
"Resolution": {
"1": "MNI305 1mm isotropic resolution",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MNI305 seems to be shoehorning the Space or CoordinateSystem metadata flag in here. If I understood correctly, the point was to dissociate space from sampling.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - would limit the description to "1mm isotropic"

Aside: we have any example/suggestions for what to use for 0.9mm or 0.7mm (this is a pretty common one for anat data)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@effigies - I agree it is a bad example, not so sure it really shadows Space or CoordinateSystem. Probably something like "Matches MNI305 1mm isotropic resolution" would be better. For the sake of clarity, I'll get MNI305 out of the way. Will fix soon.

@edickie I'll add an example for that. BTW it is important to note that res/den are only to be present if necessary (because several resolutions or densities are generated).

@robertoostenveld
Copy link
Collaborator

Would this also apply to temporal resampling? That can be done both for 4D nifty BOLD, and electrophys?

@oesteban
Copy link
Collaborator Author

@robertoostenveld
Would this also apply to temporal resampling? That can be done both for 4D nifty BOLD, and electrophys?

The intent here is to describe only spatial samplings. Feel free to suggest modifications to represent temporal samplings too or a new keyword (time-?).

Another question is whether there is a use case where you have more than one time-samplings. I can see pipelines having both (spatial) high- and low-resolution pathways, but I haven't seen such a thing with the time dimension of fMRI.

Copy link
Collaborator

@edickie edickie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no "space" keyword in the CIFTI example? isn't space-fsaverage still required in the filename or the .json?

Also - the CIFTI example should also remove the reference template (space) from the densitiy description?

@oesteban
Copy link
Collaborator Author

Why no "space" keyword in the CIFTI example?

Following the spec, that would mean data remain in the original BOLD "space". And that is completely correct regardless of from where and how you brought in the prior knowledge (i.e., the location of volumetric and surface sampling positions). For simplicity, I preferred to keep the concept of space away.

isn't space-fsaverage still required in the filename or the .json?

Following the previous point, space-fsaverage here would be a misrepresentation of the data

Also - the CIFTI example should also remove the reference template (space) from the densitiy description?

I can add the language Matched with as I did in previous examples, if that feels clearer. The intent of this description is to quickly tell how densely you sample the manifold ("what we are looking at? - a surface of 41k points per hemisphere with one corresponding fMRI time series per vertex"), it is not to keep track of the provenance ("okay, this is 41k vertices per hemisphere because it comes from fsaverage6").

@robertoostenveld
Copy link
Collaborator

The intent here is to describe only spatial samplings. Feel free to suggest modifications to represent temporal samplings too or a new keyword (time-?).

Another question is whether there is a use case where you have more than one time-samplings. I can see pipelines having both (spatial) high- and low-resolution pathways, but I haven't seen such a thing with the time dimension of fMRI.

It is the principle rather than the specific topic of resampling along a specific dimension: I believe that keywords limited to one modality are better not be part of common derivatives, but be under the modality specific derivatives. Otherwise all keywords of all modalities (including future BEPs for other data types) should be documented in common derivatives, which would become a mess.

Common derivatives should describe metadata that is common and that - in principle - could apply to all modalities. E.g. the operating system on which the pipeline ram, the software details, component versions, the type of CPU or GPU, etc. Hence I would expect derivatives/01-introduction.md to be relatively small and derivatives/02-mri.md more elaborate.

@oesteban
Copy link
Collaborator Author

oesteban commented Aug 16, 2019 via email

@robertoostenveld
Copy link
Collaborator

This addition applies to PET/SPECT, CT, derived contrast maps, or any 3D object with samples in a regular grid or a surface..

yes, which means that right now it only covers 1 of the 5 modalities in the specification (since CT and PET are not yet part of the specification).

I can see the point that some derivative extensions apply to more than one (i.e. MRI, PET and CT; we would also have that with MEG, EEG and iEEG, and I would also expect BEH and EYE to have that overlap. So common-imaging, and common-electrophys and common-behavioral make sense. But common-to-all should be distinguished from imaging.

@robertoostenveld
Copy link
Collaborator

This addition applies to PET/SPECT, CT, derived contrast maps, or any 3D object with samples in a regular grid or a surface..

Yes, which means that it only applies to 1 out of the 5 modalities that is presently included in the specification (PET and CT are not yet).

It makes sense to have common-imaging-derivatives (for MRI, PET and CT), common-electrophys-derivatives (for MEG, EEG and iEEG) and common-behavioral-derivatives (for BEH and eyetracking), but common-for-all-derivatives should not prescribe details for either one, unless it really applies to that derivative.

@effigies
Copy link
Collaborator

Okay, so it sounds like @robertoostenveld's primary concern here is with the res keyword being specifically spatial, as temporal resolution is the resolution of interest in electrophysiological analysis. Would it be reasonable to recast this res as sres for spatial resolution (or vres for volumetric resolution) and save tres for temporal resolution?

They're not quite as readable (especially tres, which is an actual word...), so I'm happy to hear alternative suggestions.

Another option is for res- to have different meanings in different modalities, but this would require it to be the case that you won't generally want to combine data across modalities and indicate that you have, for example, estimated a 50Hz time series in a 1mm isotropic voxel grid. I don't know how likely that would be, so I can't really issue a preference.

Common derivatives should describe metadata that is common and that - in principle - could apply to all modalities. E.g. the operating system on which the pipeline ram, the software details, component versions, the type of CPU or GPU, etc. Hence I would expect derivatives/01-introduction.md to be relatively small and derivatives/02-mri.md more elaborate.

@robertoostenveld I'm going to be reworking the MR-specific sections of common derivatives to make just this sort of distinction. I'll ping all derivatives folks to have a look when I have done.

@robertoostenveld
Copy link
Collaborator

I don't have a preference for a specific name or so, and don't object to res itself. It is just that it should be clear that the keyword applies only to imaging data, not to behavioral or electrophys. I have had many electrophys people expressing confusion about BIDS documentation that is MRI biased, asking whether fields like acq, rec, dir, mod etcetera also apply for their data.

Oh, now that I think of it, we put quite some effort into making this table, not only to document their possible presence and order, but also to ensure consistency between similar types (like EEG and iEEG). The table is relevant to keep in mind for mapping keys (as in <key>-<value) to datatypes/formats.

Regarding the res equivalent for electrophys: that is something to discuss with the electrophys folks, but I expect them to phrase this as "resampling" rather than "changing the resolution", and that the value would rather be expressed in Hz (i.e. rate) rather than seconds (i.e. duration between samples). I.e., I suspect fsample to be preferred by the electrophys field.

If temporal resampling were an issue for processing 4D NIFTI (which it is not at the moment), then it would not surprise me if fMRI folks would prefer that to be expressed in seconds (duration between samples, i.e. TR) rather than Hz (rate). In that case I would also be fine with another keyword than fsample for derived 4D bold data.

This example also makes me realize that the proposed res could not apply to the time axis any way, since the <value> must be specified in mm. ... or would the value be a string including the units? How about the value for den? If the units are not specified in the value, where would they be specified? Here I am thinking e.g. about high-res MRI at 100um or CT at 30um. Units, see also the appendix are tricky to get documented properly.

@effigies
Copy link
Collaborator

effigies commented Dec 9, 2019

@oesteban @edickie What's the status of this conversation?

Why no "space" keyword in the CIFTI example?

Following the spec, that would mean data remain in the original BOLD "space". And that is completely correct regardless of from where and how you brought in the prior knowledge (i.e., the location of volumetric and surface sampling positions). For simplicity, I preferred to keep the concept of space away.

isn't space-fsaverage still required in the filename or the .json?

Following the previous point, space-fsaverage here would be a misrepresentation of the data

Also - the CIFTI example should also remove the reference template (space) from the densitiy description?

I can add the language Matched with as I did in previous examples, if that feels clearer. The intent of this description is to quickly tell how densely you sample the manifold ("what we are looking at? - a surface of 41k points per hemisphere with one corresponding fMRI time series per vertex"), it is not to keep track of the provenance ("okay, this is 41k vertices per hemisphere because it comes from fsaverage6").

Now that fsLR is in common derivatives, should we add a space to the CIFTI files?

@effigies
Copy link
Collaborator

effigies commented Dec 9, 2019

@robertoostenveld @satra @emdupre Any remaining issues on your ends before merging to common-derivatives?

@satra
Copy link
Collaborator

satra commented Dec 10, 2019

@effigies - without a reference to space in the cifti json file, how would i know the underlying geometry of the cifti file, both the surface part and the volume part?

@effigies
Copy link
Collaborator

@satra If I understand Common file level metadata fields correctly, if space- is a standard template, then that defines it. If it is not, then it requires a SpatialReference metadata field.

@oesteban?

@robertoostenveld
Copy link
Collaborator

@robertoostenveld @satra @emdupre Any remaining issues on your ends before merging to common-derivatives?

I had a look at https://bids-specification.readthedocs.io/en/common-derivatives/05-derivatives/01-introduction.html and the other two pages.

In the "Introduction" I am not so happy with the phrasing SpatialReference = REQUIRED when a custom template or file is used.

I would prefer REQUIRED in case a custom reference image was used.

Other than that I am fine with it. From the @bids-standard/derivatives-electrophys perspective I believe that the common derivatives and the anatomical MRI will allow a further/future specification for derivatives of that data as well.

@oesteban
Copy link
Collaborator Author

I'm fine with @robertoostenveld suggestion. Will update including that, when #301 (comment) is more clear - if everyone is 👍 with the suggestion.

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple last notes before meeting. The heart of it will be the SpatialReference discussion.

sufficiently similar in practice to treat them equivalently.

When two or more instances of a given derivative are provided with resolution
(or surface sampling density) being the only difference between them, then the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(or surface sampling density) being the only difference between them, then the
or surface sampling density being the only difference between them, then the

| ------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| SkullStripped | REQUIRED. Boolean. Whether the volume was skull stripped (non-brain voxels set to zero) or not. |
| Resolution | REQUIRED if `res` keyword present. Dictionary of Strings to Strings. Specifies the interpretation of the resolution keyword. |
| Density | REQUIRED if `den` keyword present. Dictionary of Strings to Strings. Specifies the interpretation of the density keyword. |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent with a per-res/den JSON sidecar, in which case it's just a string.

Assuming we want to permit two approaches, given that it's complicated, I think we could describe this as:

REQUIRED if <entity> is present. String or dictionary of label to string. See <section name> for details.

@effigies
Copy link
Collaborator

Unfortunately, my computer crashed with my notes from our meeting last week and I did not use a program with auto-save, so if anybody has those, feel free to share.

The upshot of the meeting, however, was that I would be working to put together a concrete use case and a tool that would allow us to go from a simple JSON representation of the relevant spaces to a sparse, but usable, Connectome-Workbench-style .wb.spec file.

To this end, I've processed some data with fMRIPrep 20.0.0 with --cifti-output enabled, and will be manually augmenting with a JSON sidecar.

fmriprep/sub-01
├── anat
...
├── func
...
│   ├── sub-01_task-repeatchange_run-1_space-fsLR_den-91k_bold.dtseries.json
│   ├── sub-01_task-repeatchange_run-1_space-fsLR_den-91k_bold.dtseries.nii
...

Currently the JSON file contains:

{"space": "HCP grayordinates", "surface": "fsLR", "volume": "MNI152NLin6Asym", "surface_density": "32k", "grayordinates": "91k"}

As this is a case where the image is not part of the outputs, it makes sense to use URLs to indicate where a file can be retrieved. I would suggest making fmriprep/space-fsLR_den-91k_bold.dtseries.json:

{
    "SpatialReference": {
        "VolumeAll": "https://github.com/templateflow/tpl-MNI152NLin6Asym/blob/796ab78/tpl-MNI152NLin6Asym_res-02_T1w.nii.gz",
        "CortexLeft": "https://github.com/templateflow/tpl-fsLR/blob/f82f453/tpl-fsLR_hemi-L_den-32k_midthickness.surf.gii",
        "CortexRight": "https://github.com/templateflow/tpl-fsLR/blob/f82f453/tpl-fsLR_hemi-R_den-32k_midthickness.surf.gii"
    }
}

For now I'll assume that URLs are acceptable, and say this should translate into:

<?xml version="1.0" encoding="UTF-8"?>
<CaretSpecFile Version="1.0">
   <MetaData>
   </MetaData>
   <DataFile Structure="CortexLeft"
             DataFileType="SURFACE"
             Selected="true">
      https://github.com/templateflow/tpl-fsLR/blob/f82f453/tpl-fsLR_hemi-L_den-32k_midthickness.surf.gii
   </DataFile>
   <DataFile Structure="CortexRight"
             DataFileType="SURFACE"
             Selected="true">
      https://github.com/templateflow/tpl-fsLR/blob/f82f453/tpl-fsLR_hemi-R_den-32k_midthickness.surf.gii
   </DataFile>
   <DataFile Structure="Invalid"
             DataFileType="VOLUME"
             Selected="true">
      https://github.com/templateflow/tpl-MNI152NLin6Asym/blob/796ab78/tpl-MNI152NLin6Asym_res-02_T1w.nii.gz
   </DataFile>
</CaretSpecFile>

A second, per-subject use case might look like fmriprep/sub-01/func/sub-01_bold.dtseries.json:

{
    "SpatialReference": {
        "VolumeAll": "../anat/sub-01_desc-preproc_T1w.nii.gz",
        "CortexLeft": "../anat/sub-01_space-fsnative_hemi-L_den-32k_midthickness.surf.gii",
        "CortexRight": "../anat/sub-01_space-fsnative_hemi-R_den-32k_midthickness.surf.gii"
        }
}

and

<?xml version="1.0" encoding="UTF-8"?>
<CaretSpecFile Version="1.0">
   <MetaData>
   </MetaData>
   <DataFile Structure="CortexLeft"
             DataFileType="SURFACE"
             Selected="true">
      ../anat/sub-01_space-fsnative_hemi-L_den-32k_midthickness.surf.gii
   </DataFile>
   <DataFile Structure="CortexRight"
             DataFileType="SURFACE"
             Selected="true">
      ../anat/sub-01_space-fsnative_hemi-R_den-32k_midthickness.surf.gii
   </DataFile>
   <DataFile Structure="Invalid"
             DataFileType="VOLUME"
             Selected="true">
      ../anat/sub-01_desc-preproc_T1w.nii.gz
   </DataFile>
</CaretSpecFile>

Before I get working on a tool, does this seem reasonable to everybody? I'm using the T1w above, but possibly dsegs make more sense?

@effigies
Copy link
Collaborator

Following a meeting with @satra, @oesteban and @mgxd, we established the following:

  1. wb_view will take URLs, as long as those URLs resolve to valid GIFTI/NIfTI files. I have separately confirmed that they will take relative paths, as well.
  2. Any conventions promoted here should be made with reference to the CIFTI-2 specification, not conventions followed by ConnectomeWB. It is the responsibility of tools to convert between these, when necessary.
  3. Additional measures should be taken on the TemplateFlow fMRIPrep side as reference implementations to ensure that good URLs that point directly to the necessary GIFTI (Conte69) and NIfTI (MNI152NLin6Asym) can be retrieved programmatically. This is not necessary for the spec work, but valid and persistent URLs will make examples more valuable.

I will try to polish up this pull request ASAP, which might need to wait until late next week based on my schedule.

Please add any notes you think are important.

cc @sappelhoff @franklin-feingold Status update.

@effigies
Copy link
Collaborator

@satra Reading the CIFTI-2 spec, it looks like we have neither CortexLeft or CORTEX_LEFT, but rather CIFTI_STRUCTURE_CORTEX_LEFT. Do you think we could say "BrainStructure values with the prefix CIFTI_STRUCTURE_ omitted` or simply that we suggest using "BrainStructure values from page 10 of the CIFTI-2 specification"?

I plan to finish drafting this in the next 1-2 days.

@satra
Copy link
Collaborator

satra commented May 20, 2020

@effigies - don't know a good answer. if every future structure starts with that prefix, then sure. but in the absence of any guarantees seems like aligning to the CIFTI -2 spec in this case makes sense.

@effigies
Copy link
Collaborator

@edickie @satra @oesteban @robertoostenveld Can you please review this PR when you get a few minutes? If there are any conceptual issues rather than just drafting issues, can we schedule a call ASAP?

@edickie
Copy link
Collaborator

edickie commented May 21, 2020

Love it so far! I have one more clarification question re: inheritance principle. For the case of spaces - we could have a top level json called:

space-fsLR_den-91k_bold.json
space-fsLR_den-91k_thickness.json
sub-01/
    anat/
        sub-01/anat/sub-01_hemi-L_space-fsLR_den-91k_thickness.shape.gii
    func/
         sub-01_task-rest_space-fsLR_den-91k_bold.dtseries.nii

or

space-fsLR_den-91k.json
sub-01/
    anat/
        sub-01/anat/sub-01_hemi-L_space-fsLR_den-91k_thickness.shape.gii
    func/
         sub-01_task-rest_space-fsLR_den-91k_bold.dtseries.nii

Do we need to clarify this in an extra example?

@oesteban oesteban requested a review from edickie May 21, 2020 20:52
Copy link
Collaborator Author

@oesteban oesteban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @effigies. I can't approve my own PR, so I'll give you my green checkmark in writing.

src/05-derivatives/01-introduction.md Outdated Show resolved Hide resolved
```JSON
{
"SpatialReference": {
"VolumeReference": "https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym_res-02_T1w.nii.gz",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we stick with constant names from CIFTI-2 for volume as well?

Suggested change
"VolumeReference": "https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym_res-02_T1w.nii.gz",
"VOLUME": "https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym_res-02_T1w.nii.gz",

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm ok with changing to VOLUME but @oesteban where in cifti-2 did you find that constant?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just in case it carries other semantics

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I can explain my reasoning:

  1. A CIFTI-2 file, as currently generated by fMRIPrep, has 22 BrainModels. CORTEX_LEFT, CORTEX_RIGHT and a bunch of volumetric ones.
  2. There is a CIFTI_STRUCTURE_ALL_GREY_MATTER and CIFTI_STRUCTURE_OTHER_GREY_MATTER.
    1. These are not in an ontology, where it can be deduced that, when looking for the reference for some grey matter structure, one should also check this entry.
    2. These are hypothetically structures that could be directly encoded, which would lead to an ambiguity of whether the reference was meant specifically for those structures or, again, all grey matter.
  3. In the absence of a relevant CIFTI-2 constant, we are imposing something in BIDS. The BIDS convention for JSON keys is CamelCase.
  4. In the event that this becomes relevant to another file format, it is not guaranteed that we will get constants that in CAPITAL_SNAKE.

I'm not deeply opposed to VOLUME, but unlike the others, using doesn't reduce the impedance mismatch for CIFTI-2. So I'd lean towards BIDS-like conventions.

Co-authored-by: Oscar Esteban <code@oscaresteban.es>
@effigies
Copy link
Collaborator

@edickie...

space-fsLR_den-91k.json
sub-01/
    anat/
        sub-01/anat/sub-01_hemi-L_space-fsLR_den-91k_thickness.shape.gii
    func/
         sub-01_task-rest_space-fsLR_den-91k_bold.dtseries.nii

Do we need to clarify this in an extra example?

Are sidecars without suffixes permitted? I can't think of an example within the rest of the spec or derivatives off the top of my head. My inclination is that we don't need this, but happy to be wrong, or to include something you suggest.

@effigies
Copy link
Collaborator

I think the final two comments can be summarized:

  1. We may want to add discussion of top-level sidecars for SpatialReference.
    • My overall feeling is that this is a general principle, and unless we're introducing something new WRT inheritance, we don't have a particular need to duplicate this in the example. I don't think we do that here.
  2. We could use VOLUME instead of VolumeReference.

I think that making these changes, if the community deems them necessary, would be sufficiently small as to be doable on the main PR (#265). To get things moving on the final review, I'm going to merge shortly.

@effigies effigies merged commit a7beb3d into bids-standard:common-derivatives May 22, 2020
@oesteban oesteban deleted the enh/resolution-density branch May 22, 2020 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants