BIDS-like organization for Atlas/Library #1697

Open
TheChymera opened this issue Feb 5, 2024 · 6 comments

@TheChymera
Collaborator

TheChymera commented Feb 5, 2024

The ABI mouse brain maps for gene expression and connectivity (well, projection) are a particularly useful resource for neuroimaging, where they can help correlate whole-brain maps with cellular/molecular characteristics.
Sadly, they're published via an API which takes a bit of time to understand, and they come as NRRD files without spatial information and without being registered to a brain imaging template.

A while ago I put together a bunch of scripts to handle download, NIfTI-fication, registration, etc.
The archive I have produced and used thus far looks like this:
https://gin.g-node.org/TheChymera/ABI-connectivity-data_generator/src/master/procdata

It's very bare-bones and requires reading in a custom XML to properly interpret. I was thinking a BIDS-like style would perhaps make it easier for others in neuroimaging to leverage this resource.
This is what I have so far:
https://gin.g-node.org/TheChymera/ABI-connectivity-data_generator/src/master/bids

I was wondering if anybody else is interested in chiming in.
I know there is BEP038 which deals with “atlases”, and to me this library is very much an atlas, but I'm not sure this was the vision for the BEP, it seems to assume as far as I can tell that an atlas is one file.
Sure, all the maps can be concatenated along a fourth dimension, but that would just end up being a gigantic file, with a gigantic JSON to properly make sense of the positional information in the fourth dimension.
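
To make the bookkeeping concern concrete, here is a minimal sketch of the index a single concatenated 4D file would need. The `(seed, expression)` pairs, the `labelA`/`labelB` names, and the `volume`/`seed`/`expression` keys are all hypothetical placeholders, not something the existing archive defines.

```python
import json

# Hypothetical (seed, expression) pairs; the real library holds far more maps.
maps = [("VTA", "labelA"), ("VTA", "labelB"), ("ACAd", "labelA")]


def build_volume_index(maps):
    """Map each position along the fourth dimension to its seed and expression."""
    return [
        {"volume": i, "seed": seed, "expression": expression}
        for i, (seed, expression) in enumerate(maps)
    ]


index = build_volume_index(maps)
print(json.dumps(index, indent=2))
```

Every map adds one entry, so the sidecar grows linearly with the library, and any lookup by seed has to go through this JSON rather than through the filename.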

Just for context, if you're wondering, this specific atlas is a collection of histological projection maps based on an injection site and an expression pattern, so you get e.g. a map of projections of different cell types from the VTA here → https://gin.g-node.org/TheChymera/ABI-connectivity-data_generator/src/master/bids/seed-VTA

@yarikoptic

Also, @dyf, what do you think about this? I think I mentioned it to you at the ODIN meeting. I think it would make your data more accessible, though I'm wondering what you think is really important from the XML and should get a field in the JSON sidecar. Maybe this exercise could be relevant to some of the points raised here.

@effigies
Collaborator

effigies commented Feb 5, 2024

I know there is BEP038 which deals with “atlases”, and to me this library is very much an atlas, but I'm not sure this was the vision for the BEP, it seems to assume as far as I can tell that an atlas is one file.

An atlas is not necessarily a single file, but a collection of related files. What are the actual contents of these files? They say FLUO, are they microscopy images, or masks/probabilistic segmentations derived from microscopy images?

@TheChymera
Collaborator Author

They are fluorescent microscopy data reconstructed from brain slices. That's already the first snag, because FLUO doesn't currently support .nii.gz, but in this case it's FLUO for use in neuroimaging.... so I think it makes sense.

There's no segmentation, one file is one feature, i.e. the projections of one cell type from one brain area.

@effigies
Collaborator

effigies commented Feb 5, 2024

That's already the first snag, because FLUO doesn't currently support .nii.gz, but in this case it's FLUO for use in neuroimaging.... so I think it makes sense.

Okay, so there would need to be some agreement that volumes reconstructed from microscopy are valid for NIfTI. Or possibly a more general statement that data may be converted among any BIDS-supported file formats as a derivative, in order to facilitate inter-modality analyses.

In the current framework, it might be reasonable to reconstruct these as .ome.zarr files, from which NIfTI would be a pretty simple conversion for a pipeline that needed the data in NIfTI.

Your seed-<label> and expression-<label> entities would also need to be proposed or mapped onto existing concepts. It's possible that label-<seedlabel> would work, but it's a kind of awkward fit.

If we allowed that all of these exist, then the BEP38 addition would just be an explicit atlas name:

ds/
  atlas-atlasName/
    atlas-atlasName_seed-ACAd_expression-<label>_FLUO.nii.gz
    atlas-atlasName_seed-ACAd_expression-<label>_FLUO.json
    ...
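
As a rough illustration of how a pipeline might consume such names, the sketch below splits a filename of this shape into its key-value entities and filters by seed. The helpers `parse_bids_like` and `select_by_seed` and the `labelA`/`labelB` expression labels are hypothetical, and `seed-` and `expression-` remain proposals from this thread rather than existing BIDS entities.

```python
def parse_bids_like(filename):
    """Split a BIDS-like filename into (entities, suffix, extension)."""
    # Everything after the first dot is treated as the extension (e.g. ".nii.gz").
    stem, dot, extension = filename.partition(".")
    # The last underscore-separated chunk is the suffix; the rest are key-value pairs.
    *pairs, suffix = stem.split("_")
    entities = {}
    for pair in pairs:
        key, sep, value = pair.partition("-")
        if not (key and sep and value):
            raise ValueError(f"not a key-value entity: {pair!r}")
        entities[key] = value
    return entities, suffix, dot + extension


def select_by_seed(filenames, seed):
    """Return the filenames whose seed- entity equals `seed`."""
    return [f for f in filenames if parse_bids_like(f)[0].get("seed") == seed]


# Hypothetical file list; labelA and labelB stand in for real expression labels.
files = [
    "atlas-atlasName_seed-ACAd_expression-labelA_FLUO.nii.gz",
    "atlas-atlasName_seed-VTA_expression-labelA_FLUO.nii.gz",
    "atlas-atlasName_seed-VTA_expression-labelB_FLUO.json",
]
entities, suffix, extension = parse_bids_like(files[0])
print(entities, suffix, extension)
print(select_by_seed(files, "VTA"))
```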

@TheChymera
Collaborator Author

@effigies ok, I can adapt it more to BEP038 if you think it could fit the concept. What's the status on the BEP, is it almost finalized/dead? The last comment seems to have gotten no response in a while.

@effigies
Collaborator

effigies commented Feb 5, 2024

It's quite active and nearly finalized: #1281

@TheChymera
Collaborator Author

From #1281:


1

I don't think BEP038 supports custom fields (and maybe it shouldn't), like the seed- and expression- fields relevant for this data.

_expression is probably not related to the atlas BEP at all and rather worth checking in a separate issue with clear description on the use case.

_seed -- that relates to connectivity https://bids.neuroimaging.io/bep017 -- please check how would be expressed there?

for a workaround, as you mention, _desc is more for a "derivative data" so not a good match. May be smth like _acq- which people typically abuse for such purposes to provide additional detail on MRI acquisition would be the better one?


2

@yarikoptic

May be smth like _acq- [for the seed]

I think that's an even worse idea than desc-, since it really has nothing to do with the acquisition.
If anything acq- could describe some protocol shorthand for the actual fluorescent imaging method, but I'm not sure that's relevant enough to occupy a filename field.

bep017

Well, that doesn't introduce a seed- key-value pair, in fact it only makes use of the term “seed” to explain in more understandable terms in the BEP text what is explained more cryptically in the JSON sidecars.
Perhaps it makes sense for that BEP specifically (I actually don't think so and commented on the doc to that effect), but it certainly wouldn't help us here because:

  • There isn't really any ROI for the seed, since it's a biological procedure. Nobody measured the diffusion of the virus particles upon injection but before expression, and if they had it wouldn't match any reference parcellation.
  • The qualitative seed region is nonetheless one of the main aspects for querying the data, whether by hand or programmatically, so hiding that in a JSON is a non-starter.
  • In the JSON, what we would need as extra information for the seed are the injection coordinates, technically a single point, corresponding to the centroid of the aforementioned unknown diffusion volume. That's a specification that doesn't really correspond to seed-based analysis, where even if you had a single-voxel seed it would still be specified as a voxel in affine space rather than as a point in the coordinate space.
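
A small sketch of the distinction drawn in the last point, under stated assumptions: the affine, the voxel index, the injection coordinates, and the sidecar keys (`SeedLabel`, `InjectionCoordinates`, `InjectionCoordinateUnits`) are all made up for illustration and are not defined by BIDS, BEP017, or BEP038.

```python
import json


def voxel_to_world(affine, ijk):
    """Apply a 4x4 affine (nested lists) to a voxel index, returning world coordinates."""
    i, j, k = ijk
    return tuple(row[0] * i + row[1] * j + row[2] * k + row[3] for row in affine[:3])


# Hypothetical affine: 0.25 mm isotropic voxels with an arbitrary origin offset.
affine = [
    [0.25, 0.0, 0.0, -5.0],
    [0.0, 0.25, 0.0, -8.0],
    [0.0, 0.0, 0.25, -4.0],
    [0.0, 0.0, 0.0, 1.0],
]

# A seed-based analysis names a voxel, which only gains a position through the affine.
voxel_seed_mm = voxel_to_world(affine, (20, 30, 10))

# An injection site is a point given directly in the coordinate space,
# so it need not coincide with any voxel centre.
sidecar = {
    "SeedLabel": "VTA",
    "InjectionCoordinates": [0.1, -0.6, -1.4],
    "InjectionCoordinateUnits": "mm",
}
sidecar_text = json.dumps(sidecar, indent=2)

print(voxel_seed_mm)
print(sidecar_text)
```

The voxel seed depends on the image grid, while the injection point stays valid for any resampling of the same space, which is why the two would need different sidecar fields.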

The other thing from BEP017 that I could leverage is the _relmat suffix.
In a sense it might be more informative than _FLUO, since the modality is a poorer descriptor of the data than the fact that it represents relationships between all brain voxels and a purported specific structure.
The other issue with that is that in BEP017 seed-based connectivity data is described as a derivative, which it is.
This atlas data, however, does not need to be a derivative in that sense; it's primarily a reformatting of NRRD/XML data to NIfTI/JSON.

@effigies do you think this would be a good addition to the BEP, i.e. determining what the extension would look like in light of connectivity atlases from different modalities, and deciding whether to keep the modality suffix or use something more generic like _relmat? If so, I think a seed- key-value pair would be a good introduction and might help both this BEP and BEP017.
