[ENH] Add the ability of users to specify an explicit HED.xml schema for validation. #527

Merged
merged 59 commits into from
Aug 6, 2020
1665052
Minor typographic corrections
VisLab May 18, 2020
19587e1
Added specification of HED schema to event.json sidecar
VisLab May 18, 2020
13fa051
Updated the wording of the HED schema string
VisLab May 18, 2020
aefb3b3
Add missing quote on key in last HED JSON example
happy5214 May 22, 2020
47c5519
Fix link to HED spec and minor spelling and formatting fixes
happy5214 May 22, 2020
cf876dc
Added a HEDSchemaPath option to json sidecar
VisLab May 22, 2020
e3581f9
Added a HEDSchemaPath option
VisLab May 22, 2020
43a944e
Added order of resolution in looking for HED schema
VisLab May 22, 2020
44a0bb3
Corrected a typo in the description
VisLab May 22, 2020
05d0b53
Corrected another typo
VisLab May 22, 2020
d2b5810
Update 03-hed.md
VisLab Jul 10, 2020
06dc75b
Update 03-hed.md
VisLab Jul 10, 2020
a629a36
Update 03-hed.md
VisLab Jul 10, 2020
0f25c9a
Update 03-hed.md
VisLab Jul 10, 2020
a8e4a81
Merge branch 'master' of https://github.com/bids-standard/bids-specif…
VisLab Jul 15, 2020
a3b134e
Changed to HEDVersion in dataset_description.json
VisLab Jul 15, 2020
396ab21
Added HEDVersion to the dataset description.
VisLab Jul 15, 2020
5ae795b
Merged previous changes
VisLab Jul 15, 2020
0f67cd9
Updated specification for dataset_description.json
VisLab Jul 15, 2020
62c9268
Corrected spacing in dataset_description.json
VisLab Jul 15, 2020
fc18b5c
Corrected minor typo in the update
VisLab Jul 15, 2020
c51607b
Changed OPTIONAL to RECOMMENDED for HEDVersion
VisLab Jul 15, 2020
827680e
Fix the HEDVersion to just be a filename
VisLab Jul 17, 2020
cd22bd5
Changed relative path to just XML filename
VisLab Jul 17, 2020
2e8b143
Update 03-modality-agnostic-files.md
VisLab Jul 17, 2020
1887486
Merge branch 'master' of https://github.com/bids-standard/bids-specif…
VisLab Jul 23, 2020
9d5f702
Updated the example specifying the HEDVersion
VisLab Jul 23, 2020
70fe983
Fixed the typo in the json description
VisLab Jul 23, 2020
8de7370
Worked on the explanation of the HED categorical mechanism
VisLab Jul 23, 2020
55e4b7e
Fixed a typo in the Json example
VisLab Jul 23, 2020
51c6205
Reorganized the explanation of categorical versus HED column
VisLab Jul 23, 2020
983cfac
Removed extra space
VisLab Jul 23, 2020
c3e4b29
Fixed a typo
VisLab Jul 23, 2020
d0db662
Added a json sidecar to the first events example
VisLab Jul 23, 2020
5deec36
Added a sidecar to the task-emoface example
VisLab Jul 23, 2020
f4c2269
Fixed typo in sidecar example
VisLab Jul 23, 2020
f7fc27f
Respaced example
VisLab Jul 23, 2020
60215d7
Fixed missing json quote
VisLab Jul 23, 2020
85ecc4d
Fixed misplaced JSON quote
VisLab Jul 23, 2020
0e4d8de
More spacing issues
VisLab Jul 23, 2020
e69e93f
Fixed alignment
VisLab Jul 23, 2020
763df40
Update 03-hed.md
VisLab Jul 27, 2020
5210e7f
Moved HEDVersion and updated description
VisLab Aug 2, 2020
cfee293
Replaced \_events with *_events for filenames
VisLab Aug 2, 2020
babf94c
Put the backticks in the *_events filenames
VisLab Aug 2, 2020
9b4345e
Removed trailing white space in examples
VisLab Aug 2, 2020
afb65e2
Removed space before example
VisLab Aug 2, 2020
bd410fa
Removed extra line in a couple of places
VisLab Aug 2, 2020
969ae45
Removed extra line before example
VisLab Aug 2, 2020
2ad7c13
Attempted to fix dataset_description table align; also a typo
VisLab Aug 2, 2020
8de2dc7
Travis wanted ending pipes, so they're back
VisLab Aug 2, 2020
a500452
Clarified columns of events.tsv in json
VisLab Aug 2, 2020
340de69
Explained JSON sidecar values
VisLab Aug 2, 2020
302e8cd
Fixed spacing in HEDVersion table entry.
VisLab Aug 4, 2020
a39764d
Update 05-task-events.md
VisLab Aug 4, 2020
79e3c8d
Added another newline
VisLab Aug 4, 2020
814e2e7
Update 03-hed.md
VisLab Aug 4, 2020
43c8480
Update 03-hed.md
VisLab Aug 4, 2020
0445ad9
Two-space for hard line break
VisLab Aug 4, 2020
31 changes: 16 additions & 15 deletions src/03-modality-agnostic-files.md
Expand Up @@ -13,20 +13,20 @@ Templates:

The file `dataset_description.json` is a JSON file describing the dataset.
Every dataset MUST include this file with the following fields:

| Field name | Definition |
| ------------------------------------------------------------------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Name | REQUIRED. Name of the dataset. |
| BIDSVersion | REQUIRED. The version of the BIDS standard that was used. |
| DatasetType | RECOMMENDED. The interpretaton of the dataset. MUST be one of `"raw"` or `"derivative"`. For backwards compatibility, the default value is `"raw"`. |
| License | RECOMMENDED. The license for the dataset. The use of license name abbreviations is RECOMMENDED for specifying a license (see [Appendix II](./99-appendices/02-licenses.md)). The corresponding full license text MAY be specified in an additional `LICENSE` file. |
| Authors | OPTIONAL. List of individuals who contributed to the creation/curation of the dataset. |
| Acknowledgements | OPTIONAL. Text acknowledging contributions of individuals or institutions beyond those listed in Authors or Funding. |
| HowToAcknowledge | OPTIONAL. Text containing instructions on how researchers using this dataset should acknowledge the original authors. This field can also be used to define a publication that should be cited in publications that use the dataset. |
| Funding | OPTIONAL. List of sources of funding (grant numbers). |
| EthicsApprovals | OPTIONAL. List of ethics committee approvals of the research protocols and/or protocol identifiers. |
| ReferencesAndLinks | OPTIONAL. List of references to publication that contain information on the dataset, or links. |
| DatasetDOI | OPTIONAL. The Document Object Identifier of the dataset (not the corresponding paper). |
| **Field name** | **Definition** |
|:------------------ |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Name | REQUIRED. Name of the dataset. |
| BIDSVersion | REQUIRED. The version of the BIDS standard that was used. |
| HEDVersion | RECOMMENDED if HED tags are used. The version of the HED schema used to validate HED tags for the study. |
| DatasetType | RECOMMENDED. The interpretation of the dataset. MUST be one of `"raw"` or `"derivative"`. For backwards compatibility, the default value is `"raw"`. |
| License | RECOMMENDED. The license for the dataset. The use of license name abbreviations is RECOMMENDED for specifying a license (see [Appendix II](./99-appendices/02-licenses.md)). The corresponding full license text MAY be specified in an additional `LICENSE` file. |
| Authors | OPTIONAL. List of individuals who contributed to the creation/curation of the dataset. |
| Acknowledgements | OPTIONAL. Text acknowledging contributions of individuals or institutions beyond those listed in Authors or Funding. |
| HowToAcknowledge | OPTIONAL. Text containing instructions on how researchers using this dataset should acknowledge the original authors. This field can also be used to define a publication that should be cited in publications that use the dataset. |
| Funding | OPTIONAL. List of sources of funding (grant numbers). |
| EthicsApprovals | OPTIONAL. List of ethics committee approvals of the research protocols and/or protocol identifiers. |
| ReferencesAndLinks | OPTIONAL. List of references to publications that contain information on the dataset, or links. |
| DatasetDOI | OPTIONAL. The Document Object Identifier of the dataset (not the corresponding paper). |

Example:

Expand All @@ -53,7 +53,8 @@ Example:
"https://www.ncbi.nlm.nih.gov/pubmed/001012092119281",
"Alzheimer A., & Kraepelin, E. (2015). Neural correlates of presenile dementia in humans. Journal of Neuroscientific Data, 2, 234001. http://doi.org/1920.8/jndata.2015.7"
],
"DatasetDOI": "10.0.2.3/dfjj.10"
"DatasetDOI": "10.0.2.3/dfjj.10",
"HEDVersion": "7.1.1"
}
```

53 changes: 50 additions & 3 deletions src/04-modality-specific-files/05-task-events.md
Expand Up @@ -45,8 +45,9 @@ the first volume.

An arbitrary number of additional columns can be added. These allow describing
other properties of events that can later be referred to in modelling and
hypothesis extensions of BIDS. Note that any additional columns in a TSV file
SHOULD be documented in an accompanying JSON sidecar file.
hypothesis extensions of BIDS.
Note that the `trial_type` and any additional columns in a TSV file SHOULD be
documented in an accompanying JSON sidecar file.

In case of multi-echo task run, a single `_events.tsv` file will suffice for all
echoes.
Expand All @@ -65,6 +66,21 @@ onset duration trial_type response_time stim_file
5.6 0.6 stop 1.739 images/blue_square.jpg
```

In the accompanying JSON sidecar, the `trial_type` column might look as follows:

```JSON
{
"trial_type": {
"LongName": "Event category",
"Description": "Indicator of type of action that is expected",
"Levels": {
"go": "A red square is displayed to indicate starting",
"stop": "A blue square is displayed to indicate stopping"
}
}
}
```

References to existing databases can also be encoded using additional columns.
Example 2 includes references to the Karolinska Directed Emotional Faces (KDEF)
database<sup>6</sup>:
Expand All @@ -86,7 +102,38 @@ onset duration trial_type identifier database response_time
5.6 0.6 sad AF01ANSA kdef 1.739
```

For multi-echo files events.tsv file is applicable to all echos of particular
This file should be accompanied by a data dictionary.

```Text
sub-control01/
func/
sub-control01_task-emoface_events.json
```

The `trial_type` and `identifier` columns from the `*_events.tsv` files might be described in this
dictionary as follows.
Note that all other columns SHOULD also be described but are omitted for the sake
of example.

```JSON
{
"trial_type": {
"LongName": "Emotion image type",
"Description": "Type of emotional face from Karolinska database that is displayed",
"Levels": {
"afraid": "A face showing fear is displayed",
"angry": "A face showing anger is displayed",
"sad": "A face showing sadness is displayed"
}
},
"identifier": {
"LongName": "Unique identifier from Karolinska (KDEF) database",
"Description": "ID from KDEF database used to identify the displayed image"
}
}
```
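
The recommendation that `trial_type` and any additional columns be documented in the sidecar can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `undocumented_columns` is hypothetical, and it assumes that the `onset` and `duration` columns do not need a sidecar entry.

```python
import csv
import io
import json

# Assumption: these columns are defined by the specification itself
# and therefore need no sidecar entry.
SPEC_DEFINED_COLUMNS = {"onset", "duration"}


def undocumented_columns(events_tsv_text, sidecar_json_text):
    """Return the events.tsv columns that have no entry in the JSON sidecar."""
    reader = csv.reader(io.StringIO(events_tsv_text), delimiter="\t")
    header = next(reader)  # first row of a TSV file holds the column names
    sidecar = json.loads(sidecar_json_text)
    return [
        column
        for column in header
        if column not in SPEC_DEFINED_COLUMNS and column not in sidecar
    ]


events = (
    "onset\tduration\ttrial_type\tidentifier\tdatabase\tresponse_time\n"
    "5.6\t0.6\tsad\tAF01ANSA\tkdef\t1.739\n"
)
sidecar = json.dumps(
    {
        "trial_type": {"LongName": "Emotion image type"},
        "identifier": {"LongName": "Unique identifier from Karolinska (KDEF) database"},
    }
)
missing = undocumented_columns(events, sidecar)
```

For the abbreviated dictionary above, `missing` lists `database` and `response_time`, the two columns that were omitted for the sake of example.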

For multi-echo files, the `*_events.tsv` file is applicable to all echoes of a particular
run:

```Text
87 changes: 62 additions & 25 deletions src/99-appendices/03-hed.md
Expand Up @@ -3,7 +3,7 @@
Hierarchical Event Descriptors (HED) are a controlled vocabulary of terms describing events in a behavioral
paradigm. HED was originally developed with EEG in mind, but is applicable to
all behavioral experiments. Each level of a hierarchical tag is delimited with
a forward slash (`/`). An HED string contains one or more HED tags separated by
a forward slash (`/`). A HED string contains one or more HED tags separated by
commas (`,`). Parentheses (brackets, `()`) group tags and enable specification
of multiple items and their attributes in a single HED string (see section 2.4
in [HED Tagging Strategy Guide](http://www.hedtags.org/downloads/HED%20Tagging%20Strategy%20Guide.pdf)).
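
The delimiter rules above (slashes between levels, commas between tags, parentheses for groups) can be illustrated with a short sketch. This is not a HED validator: the function names `split_hed_string` and `tag_levels` are hypothetical, and the sketch assumes that commas never occur inside a tag itself (free-text values such as descriptions containing commas are not handled).

```python
def split_hed_string(hed_string):
    """Split a HED string on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for char in hed_string:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in HED string")
        if char == "," and depth == 0:
            # A top-level comma ends the current tag or tag group.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    if depth != 0:
        raise ValueError("unbalanced '(' in HED string")
    parts.append("".join(current).strip())
    return [part for part in parts if part]


def tag_levels(tag):
    """Return the hierarchy levels of a single HED tag."""
    return [level.strip() for level in tag.split("/")]


example = (
    "Event/Category/Experimental stimulus, "
    "(Item/Object/2D Shape/Cross, Attribute/Visual/Fixation point)"
)
top_level = split_hed_string(example)
levels = tag_levels(top_level[0])
```

Here `top_level` holds one plain tag and one parenthesized group, and `levels` holds the three hierarchy levels of the first tag. The balance check also flags strings with a missing opening or closing parenthesis.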
Expand All @@ -14,7 +14,7 @@ sidecar JSON files (`CogAtlasID` and `CogPOID`), HED tags from the `Paradigm`
HED subcategory should not be used to annotate events.

There are several ways to associate HED annotations with events within the BIDS
framework. The most direct way is to use the `HED` column of the \_events.tsv
framework. The most direct way is to use the `HED` column of the `*_events.tsv`
file to annotate events:

Example:
Expand All @@ -26,54 +26,91 @@ onset duration HED
...
```

The direct approach requires that each line in the events file must be
The direct approach requires that each line in the events file be
annotated. Since there are typically thousands of events in each experiment,
this method of annotation is usually not convenient unless the annotations are
automatically generated. In many experiments, the event instances fall into a
much smaller number of categories, and often these categories are labeled with
numerical codes or short names. It is therefore more convenient to associate
the HED annotations with these categories and allow the analysis tools to make
the association with individual event instances during analysis. To use this
approach, your \_events.tsv file should associate a category (often called an
approach, your `*_events.tsv` file should associate a category (often called an
event code) with each event instance. Since BIDS allows an arbitrary number of
columns to be included in an \_events.tsv file, you can make this association
columns to be included in an `*_events.tsv` file, you can make this association
by including columns representing various types of event categories in your
\_events.tsv file.
`*_events.tsv` file.

Example:

```Text
onset duration mycodes
1.1 n/a Fixation
1.1 n/a Fixation
1.3 n/a Button
1.8 n/a Target
...

```

If you provide an \_events.json file somewhere in your data hierarchy that has
an HED mapping for `mycodes`, the HED tags associated with a given `mycodes`
value can then be associated with the event instances in that category. You
may provide a `HED` column and multiple category columns. The union of the
relevant HED tags will then be associated with the event instance.
The tags in the `HED` column of the `*_events.tsv` file are often specific to the individual event instances,
while the common properties are represented by categorical values appearing in other columns.
You may provide a `HED` column and multiple categorical columns to document your events.
Each of these categorical columns should be documented in a corresponding `*_events.json` sidecar.
The column name (e.g., `mycodes`) is the dictionary key to this documentation, as illustrated by the following example.

Example:

```JSON
{
"mycodes": {
"HED": {
"Fixation": "Event/Category/Experimental stimulus, Event/Label/CrossFix, Event/Description/A cross appears at screen center to serve as a fixation point, Sensory presentation/Visual, Item/Object/2D Shape/Cross, Attribute/Visual/Fixation point, Attribute/Visual/Rendering type/Screen, Attribute/Location/Screen/Center",
"Target": "Event/Label/Target image, Event/Description/A white airplane as the RSVP target superimposed on a satellite image is displayed., Event/Category/Experimental stimulus, (Item/Object/Vehicle/Aircraft/Airplane, Participant/Effect/Cognitive/Target, Sensory presentation/Visual/Rendering type/Screen/2D), (Item/Natural scene/Arial/Satellite, Sensory presentation/Visual/Rendering type/Screen/2D)",
"Button": "..."
"mycodes": {
"LongName": "Local event type names",
"Description": "Main types of events that comprise a trial",
"Levels": {
"Fixation": "Fixation cross is displayed",
"Target": "Target image appears",
"Button": "Subject presses a button"
},
"HED": {
"Fixation": "Event/Category/Experimental stimulus, Event/Label/CrossFix,
Event/Description/A cross appears at screen center to serve as a fixation point,
Sensory presentation/Visual, Item/Object/2D Shape/Cross,
Attribute/Visual/Fixation point, Attribute/Visual/Rendering type/Screen,
Attribute/Location/Screen/Center",
"Target": "Event/Label/TargetImage, Event/Category/Experimental stimulus,
Event/Description/A white airplane as the RSVP target superimposed on a satellite image is displayed.,
(Item/Object/Vehicle/Aircraft/Airplane, Participant/Effect/Cognitive/Target,
Sensory presentation/Visual/Rendering type/Screen/2D),
(Item/Natural scene/Arial/Satellite,
Sensory presentation/Visual/Rendering type/Screen/2D)",
"Button": "Event/Category/Participant response, Event/Label/PressButton,
Event/Description/The participant presses the button as soon as the target is visible,
Action/Button press"
}
}
}
}
```
Downstream tools should not distinguish between tags specified using the explicit HED column and
the categorical specifications, but should form the union before analysis.
Further, the [inheritance principle](../02-common-principles.md#the-inheritance-principle) applies,
so the data dictionaries can appear higher in the BIDS hierarchy.
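
One way a downstream tool might form this union for a single event is sketched below. It is an illustration under stated assumptions rather than a prescribed algorithm: the function name `hed_for_event` is hypothetical, the row and sidecar are assumed to be already parsed into dictionaries, `n/a` is treated as "no annotation", and duplicate tags are not removed.

```python
def hed_for_event(row, sidecar):
    """Combine the explicit HED column with HED tags mapped from categorical columns.

    row: dict mapping column name to the value for one event.
    sidecar: dict parsed from the applicable *_events.json file.
    """
    pieces = []
    explicit = row.get("HED", "n/a")
    if explicit and explicit != "n/a":
        pieces.append(explicit)
    for column, value in row.items():
        if column == "HED":
            continue
        entry = sidecar.get(column)
        mapping = entry.get("HED") if isinstance(entry, dict) else None
        # Only categorical mappings (value -> HED string) are handled here.
        if isinstance(mapping, dict) and value in mapping:
            pieces.append(mapping[value])
    return ", ".join(pieces)


sidecar = {"mycodes": {"HED": {"Fixation": "Event/Label/CrossFix"}}}
row = {
    "onset": "1.1",
    "duration": "n/a",
    "mycodes": "Fixation",
    "HED": "Attribute/Location/Screen/Center",
}
combined = hed_for_event(row, sidecar)
```

The result is a single HED string containing both the event-specific tag and the tags shared by every `Fixation` event. Resolving which sidecar applies under the inheritance principle is left out of the sketch.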

The HED vocabulary is specified by a HED schema, which delineates the allowed
HED path strings. By default, BIDS uses the latest HED schema available in the
[hed-specification](https://github.com/hed-standard/hed-specification/tree/master/hedxml) repository
maintained by the hed-standard group.

You can override the default by providing a specific HED version number in the
`dataset_description.json` file using the `HEDVersion` field.
The preferred approach is to validate with the latest version (the default),
but to use the `HEDVersion` field to specify which version was used for later reference.

Example: The following `dataset_description.json` file specifies that
`HED7.1.1.xml` from the [hed-specification](https://github.com/hed-standard/hed-specification/tree/master/hedxml) repository
should be used to validate the study event annotations.

The tags in the `HED` column are often specific to the event instances, while
the common properties associated with categories such as `mycodes` are
encapsulated in the \_events.json dictionary. Downstream tools should not
distinguish between tags specified using the different mechanisms. Further,
the normal BIDS inheritance principle applies so these data dictionaries can
appear higher in the BIDS hierarchy.
```JSON
{
"Name": "The mother of all experiments",
"BIDSVersion": "1.4.0",
"HEDVersion": "7.1.1"
}
```
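
The mapping from the `HEDVersion` field to a schema file name can be sketched as follows. This is a minimal illustration, not the validator's implementation: the function name `hed_schema_filename` is hypothetical, and the `HED<version>.xml` naming pattern is inferred from the `HED7.1.1.xml` example above.

```python
import json


def hed_schema_filename(dataset_description_text):
    """Return the schema file name implied by HEDVersion, or None if it is absent."""
    description = json.loads(dataset_description_text)
    version = description.get("HEDVersion")
    if version is None:
        # No explicit version: the caller falls back to the latest schema.
        return None
    return f"HED{version}.xml"


description_text = json.dumps(
    {
        "Name": "The mother of all experiments",
        "BIDSVersion": "1.4.0",
        "HEDVersion": "7.1.1",
    }
)
schema_file = hed_schema_filename(description_text)
```

For the example above, `schema_file` is `HED7.1.1.xml`. Returning `None` when the field is missing keeps the default behaviour (latest schema) as a separate decision for the calling tool.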