-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(schema): Provide default JSON column definition for "conventional" columns #1838
feat(schema): Provide default JSON column definition for "conventional" columns #1838
Conversation
e254ca6
to
1a62d12
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1838 +/- ##
==========================================
+ Coverage 88.04% 88.06% +0.02%
==========================================
Files 16 16
Lines 1380 1391 +11
==========================================
+ Hits 1215 1225 +10
- Misses 165 166 +1 ☔ View full report in Codecov by Sentry. |
1a62d12
to
795015f
Compare
That sounds reasonable to me. Shall there be some "info" logging for these cases? 👇
|
…ified in sidecar and schema rules for that column are in the 'json schema' defined style outlined in bids-standard/bids-specification#1838
7d8f448
to
ab41d81
Compare
* Modify tsv column value type checks to use user defined types if specified in sidecar and schema rules for that column are in the 'json schema' defined style outlined in bids-standard/bids-specification#1838 * Fix merge * Remove unused ts error suppression * missed a merge markers * Update bids-validator/src/schema/applyRules.ts Co-authored-by: Chris Markiewicz <effigies@gmail.com> --------- Co-authored-by: Chris Markiewicz <effigies@gmail.com>
I don't think so. debug logging perhaps, but the standing expectation is that if you use approved values for these columns, the validator won't complain. |
776b29a
to
7f72c37
Compare
7f72c37
to
2e58636
Compare
df1559a
to
40ca7bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposal makes sense to me, as does the change to the schema.
In BIDS we have three types of columns:
participant_id
or thetype
column inelectrodes.tsv
. These are ones where having JSON-schema style definitions of values is convenient, as we can rely on a well-established way of validating values. There is no good use case for overriding these, since the meaning should not vary from dataset to dataset.age
orhandedness
. These are used in the spec and it was decided some time ago that we will let people use them without complaining about being ill-defined, as long as they use them consistently with the specification.For (1) there is no good use case of overriding, since the meaning should not vary from dataset to dataset. For (2) these definitions may be broader or narrower than is really useful, and we would like people to define what they mean. For convenience, tools should fall back to the BIDS spec for interpretation. For (3) the data are unlikely to be usable in an automated fashion without a JSON definition.
Currently (1) and (2) are defined in the same way in the schema, which leaves a gap because our column description object is not as powerful as JSON schema. When overriding conventional columns, the user has no obvious way to produce an equivalent representation, and tools have no way to help the user by injecting the default definition and inviting the user to refine it.
This PR moves type (2) to align with type (3), so that conventional columns are validated the same way as user-defined columns, and examples are available in the specification. We do this by adding an optional
definition
field to entries inobjects.columns
which is the JSON default. It is intentionally written as a JSON object inside YAML. This change also allows a validator to identify (1) and (2) columns purely through structure, rather than adding a new flag.We have already implemented this in the schema validator in https://github.com/bids-standard/bids-validator/pull/1987 and https://github.com/bids-standard/bids-validator/pull/2003, so this is known to work with the existing BIDS examples.
The logic is as follows:
This is an edited post. Expand for original text.
Some TSV columns are very well defined, while others are conventional.
As an example, the
type
column ofelectrodes.tsv
has a very explicit meaning and list of possible values; there's no reason that someone would add a description inelectrodes.json
, and if they did, the validator should not allow it to override the specification definition.In contrast the
handedness
column ofparticipants.tsv
is conventional in that we provide suggested values and their interpretations, but we would really prefer if people provide a full definition inparticipants.json
. There are handedness definitions that are not refinements or subsets; for example, the Edinburgh Handedness Inventory is a numeric scale, not a set of labels.I would like to propose that for conventional columns, we provide an explicit example JSON definition for the column. Validators would be expected to look first for a sidecar definition, fall back to the schema-defined example, and use whatever definition is available to validate the values in the column. In the case where neither is available, we would raise a warning, as we currently do. If I get some agreement here, I will do the same thing for all other conventional columns I can find.
For fully-defined columns, we will continue using JSON schema-style type definitions, which are more explicit.
cc @sappelhoff who I believe was involved in helping to formalize these last time.