add bids metadata extraction #432

satra · 2021-03-01T14:38:11Z

as we deal with non-nwb dandisets, it would be good to add bids metadata extraction, which may require parsing the tree (to get age and sex from participants.tsv, etc.).
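
For illustration, a minimal sketch of what pulling age and sex out of participants.tsv could look like. It assumes only what the BIDS spec states (tab-separated file, a `participant_id` column, `n/a` for missing values); the function name `read_participants` is hypothetical and not part of any existing dandi code.

```python
import csv
import io


def read_participants(tsv_text):
    """Map participant_id -> remaining columns; BIDS 'n/a' becomes None.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    participants = {}
    for row in reader:
        pid = row.pop("participant_id")
        participants[pid] = {
            key: (None if value == "n/a" else value)
            for key, value in row.items()
        }
    return participants


# Inline example content standing in for a real participants.tsv
example = "participant_id\tage\tsex\nsub-01\t34\tF\nsub-02\tn/a\tM\n"
participants = read_participants(example)
```

Values are kept as strings here; any conversion to the asset metadata fields would be a separate step.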

yarikoptic · 2021-03-01T15:08:50Z

I think this functionality should as much as possible align/reuse with https://github.com/datalad/datalad-neuroimaging/blob/master/datalad_neuroimaging/extractors/bids.py which ATM is just a dump of metadata as provided by pybids.
But IIRC @mih mentioned that in the scope of ebrains openminds he is considering (or just advising on?) providing a "tighter" harmonization. @mih could you briefly chime in here on the plans on that end? (or just add references)

satra · 2021-03-01T15:19:32Z

alignment is good, but we will want to fill in the fields of our asset metadata structure as well about participants and biosamples.

mih · 2021-03-01T16:25:48Z

What I was talking about in that meeting was that a bids2openminds conversion is taking place outside the scope of a metadata extractor. An extractor should report "as-is". If the metadata source (like BIDS) is not "semantically clean", a subsequent (and updatable) transformation can be used to yield a "better" (or just different) record.

I realized at some point that doing the standardization at the level of an extractor implies that applying any update to that standardization requires actual data access, and also makes metadata extraction an inherently open-ended process. Adding the possibility for customizable transformations of metadata seems much more practical when data access is complicated (which it seems to be for most datasets).
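
A rough sketch of that split, under stated assumptions: `extract_as_is`, `harmonize`, and the `SEX_MAP` vocabulary are all hypothetical names made up for this example, not datalad or dandi APIs. The point is only that the second step works on the stored raw record and can be rerun without touching the data.

```python
def extract_as_is(participants_row):
    """Extractor step: report the source values unchanged."""
    return dict(participants_row)


# Hypothetical target vocabulary; a real mapping would follow the chosen schema.
SEX_MAP = {"f": "female", "female": "female", "m": "male", "male": "male"}


def harmonize(raw_record):
    """Transformation step: derive a cleaner record from the stored raw one.

    Needs no access to the original files, so it can be updated and reapplied.
    """
    record = dict(raw_record)
    sex = raw_record.get("sex")
    record["sex"] = SEX_MAP.get(sex.lower(), "unknown") if sex else "unknown"
    return record


raw = extract_as_is({"participant_id": "sub-01", "sex": "F"})
clean = harmonize(raw)
```

The raw record stays available next to the harmonized one, so a later change to the mapping only requires rerunning `harmonize`.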

satra · 2021-06-09T15:09:34Z

@yarikoptic - perhaps we can add some bids support in the short term with respect to participant id and a few other things.

@mih - in our case metadata extraction is performed at the point of validation/upload so access is there. in the future we may want to extend the schema, for which we would indeed need to pull in the directory structure (especially for bids, where the inheritance principle does apply for some metadata).
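
To make the inheritance point concrete, here is a simplified sketch of why the directory structure is needed. It is not a full implementation of the BIDS inheritance principle: it assumes sidecars are already loaded into a dict keyed by dataset-relative path, matches only on suffix and `key-value` entities, and lets files deeper in the tree override shallower ones. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def _entities_and_suffix(filename):
    """Split 'sub-01_task-rest_bold.nii.gz' into ({'sub-01', 'task-rest'}, 'bold')."""
    stem = filename.split(".", 1)[0]
    *entities, suffix = stem.split("_")
    return set(entities), suffix


def inherited_metadata(data_path, sidecars):
    """Merge applicable JSON sidecars from the dataset root down to the file.

    `sidecars` maps dataset-relative paths to already-parsed JSON dicts.
    """
    data = PurePosixPath(data_path)
    data_entities, data_suffix = _entities_and_suffix(data.name)
    merged = {}
    # parents come nearest-first; reverse so deeper sidecars override shallower ones
    for directory in list(data.parents)[::-1]:
        for path, meta in sidecars.items():
            candidate = PurePosixPath(path)
            if candidate.parent != directory or candidate.suffix != ".json":
                continue
            entities, suffix = _entities_and_suffix(candidate.name)
            # applicable if same suffix and no entity that the data file lacks
            if suffix == data_suffix and entities <= data_entities:
                merged.update(meta)
    return merged


sidecars = {
    "task-rest_bold.json": {"RepetitionTime": 2.0, "TaskName": "rest"},
    "task-nback_bold.json": {"TaskName": "nback"},
    "sub-01/func/sub-01_task-rest_bold.json": {"RepetitionTime": 1.5},
}
meta = inherited_metadata("sub-01/func/sub-01_task-rest_bold.nii.gz", sidecars)
```

With an asset per file, the metadata for the `.nii.gz` asset cannot be computed from that file and its own sidecar alone; the top-level `task-rest_bold.json` contributes too.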

yarikoptic · 2021-06-09T16:24:48Z

yeah, I guess we shouldn't postpone for too long. I do not think we should at this point amalgamate data + sidecar files into a single asset in any way, so we will keep it KISS and have an asset per file, be it a data, sidecar, or metadata file. dataset_description.json will also be a first-class citizen and have its own asset.

Do you know of prospective datasets which would be uploaded and should be BIDS?

yarikoptic mentioned this issue Jun 9, 2021

Start populating/using *Asset.dataType ? dandi/dandi-schema#32

Open

yarikoptic mentioned this issue Jul 6, 2021

(re)design metadata extractors and "harmonization" across multiple extractors whenever needed #701

Closed

1 task

satra mentioned this issue Sep 20, 2021

simpler short-term BIDS metadata extraction #772

Open
