Formalize BIDSEntitiesLayout
Aims to provide a solution to

- bids-standard/bids-2-devel#54

### Name rationale:

Originally I thought of naming it BIDSLayout, but that name was/is used for
a class in pybids. On one hand that is great, because the two correspond in principle.
But I decided to avoid confusion, at least for now, so that it is easier to find
issues/code where such a term is used or mentioned. So for now I decided to go
with BIDSEntitiesLayout, but it would be easy to change to anything we want.
yarikoptic committed Apr 27, 2024
1 parent 59fa77e commit f8856f4
Showing 2 changed files with 105 additions and 0 deletions.
97 changes: 97 additions & 0 deletions src/modality-agnostic-files.md
@@ -30,6 +30,7 @@ and a guide for using macros can be found at
{
"Name": "REQUIRED",
"BIDSVersion": "REQUIRED",
"BIDSEntitiesLayout": "REQUIRED if layout is not just sub-*/<datatype>",
"HEDVersion": "RECOMMENDED",
"DatasetLinks": "REQUIRED if [BIDS URIs][] are used",
"DatasetType": "RECOMMENDED",
@@ -103,6 +104,102 @@ Example:
}
```

#### BIDS Entities Layout

The `BIDSEntitiesLayout` field is REQUIRED if the layout of the dataset differs from the `sub-*/<datatype>` layout of a typical BIDS 1.0 dataset without sessions.

The `BIDSEntitiesLayout` field is an object whose keys are directory names and whose values are lists of objects. Each list describes the order of the entities that define the directory hierarchy and/or the file name prefixes expected in that directory.

**TODO**: define "Entity" and "Datatype" in glossary.

**TODO**: check if *concept* is the best term to use here.

Each level of the hierarchy is defined either by an [entity](appendix/entity-table.md) or by the [datatype](glossary.md#datatype) concept.

By default, each level is represented both as a directory and as a prefix in the file name, e.g. a `subject` entity establishes a `sub-<label>/` directory and a `sub-<label>_` prefix for the filename.
If the directory is not expected, the `directory` key can be set to `false`.
If the prefix is not expected, the `prefix` key can be set to `false`.

The order of entities in the list is the order of hierarchy levels, from the root to the leaf.
It overrides the default order of entities in the BIDS specification: if the dataset lists the `session` entity before `subject`, `session` should also come first in the filename, e.g. `ses-<label>_sub-<label>`.
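
A minimal sketch of how the defaults and the ordering rule could be read, not a reference implementation. The helper `expected_location` and the `SHORT_NAMES` map are hypothetical names introduced only for this illustration; the returned prefix omits the trailing `_`.

```python
from typing import Dict, List, Tuple

# Assumption: a small hand-written map from entity names to filename keys.
SHORT_NAMES = {"subject": "sub", "session": "ses"}


def expected_location(levels: List[dict], labels: Dict[str, str]) -> Tuple[str, str]:
    """Return (directory path, filename prefix) implied by a list of levels.

    `levels` is one value of BIDSEntitiesLayout; `labels` maps entity names
    (plus "datatype") to the concrete label or datatype of one file.
    """
    dirs, prefix = [], []
    for level in levels:
        if level.get("concept", "entity") == "datatype":
            part = labels["datatype"]
        else:
            name = level["entity"]
            part = f"{SHORT_NAMES[name]}-{labels[name]}"
        # Both keys default to true: a level is a directory and a prefix.
        if level.get("directory", True):
            dirs.append(part)
        if level.get("prefix", True):
            prefix.append(part)
    # List order is hierarchy order and also the order in the filename.
    return "/".join(dirs), "_".join(prefix)
```

With `session` listed before `subject`, the same function yields a `ses-<label>/sub-<label>/` hierarchy and a `ses-<label>_sub-<label>` prefix, since both are derived from the one list.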

The scope of the `BIDSEntitiesLayout` field starts at the dataset root directory and stops at any embedded BIDS dataset (a directory with a `dataset_description.json` containing the `BIDSVersion` key).
Corollary: the `BIDSEntitiesLayout` field is not inherited by subdatasets.
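
The scoping rule could be sketched as a directory walk that prunes embedded datasets. This is only an illustration under the definition given above; `is_bids_dataset` and `files_in_scope` are hypothetical helper names, not part of any existing tool.

```python
import json
import os
from typing import Iterator


def is_bids_dataset(directory: str) -> bool:
    """True if `directory` has a dataset_description.json with a BIDSVersion key."""
    path = os.path.join(directory, "dataset_description.json")
    try:
        with open(path, encoding="utf-8") as f:
            content = json.load(f)
    except (OSError, ValueError):
        # Missing, unreadable, or invalid JSON: not treated as a dataset.
        return False
    return isinstance(content, dict) and "BIDSVersion" in content


def files_in_scope(root: str) -> Iterator[str]:
    """Yield paths (relative to `root`) governed by the layout of `root`."""
    for current, subdirs, files in os.walk(root):
        # Prune embedded datasets in place so os.walk does not descend into them.
        subdirs[:] = [d for d in subdirs
                      if not is_bids_dataset(os.path.join(current, d))]
        for name in files:
            yield os.path.relpath(os.path.join(current, name), root)
```

Files under a pruned directory would be checked against that subdataset's own `BIDSEntitiesLayout` (or the default), which is what "not inherited" amounts to in practice.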

JSON Schema for `BIDSEntitiesLayout` (elaborated with ChatGPT):

```json
{
  "type": "object",
  "patternProperties": {
    "^[./a-z]+$": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "entity": {
            "type": "string"
          },
          "concept": {
            "type": "string",
            "enum": ["entity", "datatype"],
            "default": "entity"
          },
          "prefix": {
            "type": "boolean"
          },
          "directory": {
            "type": "boolean"
          }
        },
        "additionalProperties": false,
        "if": {
          "properties": {
            "concept": {
              "const": "entity"
            }
          }
        },
        "then": {
          "required": ["entity"]
        },
        "anyOf": [
          {
            "required": ["entity"]
          },
          {
            "required": ["concept"]
          }
        ]
      }
    }
  },
  "additionalProperties": false
}
```
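
The conditional part of the schema is easy to misread, so here is a plain-Python sketch of the same rules, written by hand rather than with a JSON Schema library. `check_level` and `check_layout` are hypothetical names used only for this illustration.

```python
import re

KEY_PATTERN = re.compile(r"[./a-z]+")  # same character class as the schema
ALLOWED_KEYS = {"entity", "concept", "prefix", "directory"}


def check_level(level) -> list:
    """Return a list of problems for one item; an empty list means it passes."""
    if not isinstance(level, dict):
        return ["item must be an object"]
    problems = []
    extra = set(level) - ALLOWED_KEYS
    if extra:
        problems.append(f"unexpected keys: {sorted(extra)}")
    if "entity" in level and not isinstance(level["entity"], str):
        problems.append("entity must be a string")
    if "concept" in level and level["concept"] not in ("entity", "datatype"):
        problems.append("concept must be 'entity' or 'datatype'")
    for key in ("prefix", "directory"):
        if key in level and not isinstance(level[key], bool):
            problems.append(f"{key} must be a boolean")
    # "if"/"then": concept absent or equal to "entity" makes "entity" required.
    if level.get("concept", "entity") == "entity" and "entity" not in level:
        problems.append("entity is required unless concept is 'datatype'")
    return problems


def check_layout(layout: dict) -> list:
    """Check every directory key and every level of a BIDSEntitiesLayout value."""
    problems = []
    for key, levels in layout.items():
        if not KEY_PATTERN.fullmatch(key):
            problems.append(f"bad directory key: {key!r}")
        if not isinstance(levels, list):
            problems.append(f"{key}: value must be a list")
            continue
        for i, level in enumerate(levels):
            problems += [f"{key}[{i}]: {p}" for p in check_level(level)]
    return problems
```

The sketch has no separate branch for `anyOf`: an item with neither `entity` nor `concept` already fails the `if`/`then` rule, so that clause appears to be redundant in the schema as written.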

**Note**: ChatGPT also produced a nice LinkML schema, but without the conditional requirements.

Examples of values for `BIDSEntitiesLayout`:

- `{ "." : [ {"entity": "subject"}, {"concept": "datatype", "prefix": false} ] }` - (Default) BIDS 1.0 without sessions
- `{ "." : [ {"entity": "subject"}, {"entity": "session"}, {"concept": "datatype", "prefix": false} ] }` - BIDS 1.0 with sessions
- ```json
{ "." : [
{"entity": "subject", "directory": false},
{"entity": "session", "directory": false},
{"concept": "datatype", "prefix": false}]
}
```
Nested in sub-/ses- subdataset (ref: [devel#59: Ability to compose BIDS dataset from BIDS datasets per sub/ses](https://github.com/bids-standard/bids-2-devel/issues/59))

Alternative specification ideas:

- Sugaring, using the convention that `{entity}` is equivalent to `[{"entity": "{entity}"}]` and `{entity}/` is equivalent to `[{"entity": "{entity}", "directory": true}]`, while `datatype/` is a "hardcoded" value that corresponds to `{"concept": "datatype", "prefix": false}`.
So BIDS 1.0 with sessions could be `{ "." : ["subject/", "session/", "datatype/"] }`.
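
A short sketch of how such sugar might be expanded, assuming exactly the three conventions listed above; `expand_sugar` is a hypothetical name and returns one object per string rather than a one-element list.

```python
def expand_sugar(item: str) -> dict:
    """Expand one sugared string into the equivalent level object."""
    if item == "datatype/":
        # Hardcoded value for the datatype concept.
        return {"concept": "datatype", "prefix": False}
    if item.endswith("/"):
        return {"entity": item[:-1], "directory": True}
    return {"entity": item}


def expand_layout(layout: dict) -> dict:
    """Expand every sugared list in a layout, keeping the directory keys."""
    return {key: [expand_sugar(item) for item in items]
            for key, items in layout.items()}
```

Given that `directory` defaults to true in the long form, `{entity}` and `{entity}/` currently expand to levels with the same meaning, so the sugar may need to assign a different default to the bare form to be useful.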



#### Derived dataset and pipeline description

As for any BIDS dataset, a `dataset_description.json` file MUST be found at the
8 changes: 8 additions & 0 deletions src/schema/objects/metadata.yaml
@@ -246,6 +246,14 @@ B1ShimmingTechnique:
    The technique used to shim the *B<sub>1</sub>* field (for example, `"Simple phase align"`
    or `"Pre-saturated TurboFLASH"`).
  type: string
BIDSEntitiesLayout:
  name: BIDSEntitiesLayout
  display_name: BIDS Entities Layout
  description: |
    The specification of the hierarchy of directories and leading entities
    in the filenames of BIDS files.
    TODO: expand on type etc. Formalization is WiP within docs...
  type: object
BIDSVersion:
  name: BIDSVersion
  display_name: BIDS Version