Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need permission to update the HED validator component of BIDS for units #942

Closed
VisLab opened this issue Apr 11, 2020 · 9 comments · Fixed by #1015
Closed

Need permission to update the HED validator component of BIDS for units #942

VisLab opened this issue Apr 11, 2020 · 9 comments · Fixed by #1015

Comments

@VisLab
Copy link
Member

VisLab commented Apr 11, 2020

All,
A group of us has been working on improving the HED (https://github.com/hed-standard) event annotation schema and tools. My lab has contributed the HED validation component of the current BIDS validator.

One of the things that we have worked on is units associated with HED tags. BIDS uses SI units (for the most part). The HED schema had a limited list of allowed units. We have created a new branch of the HED schema that allows all of the SI units specified by bids including multiples and sub-multiples. We have also created a branch of the hed-validator that handles this. We would like to merge this branch into our master and incorporate it into the bids-validator.

The issue is that our new validator will no longer support previous versions of the HED schema. We think that this is an important step and should be done as soon as possible for the community. It should not affect datasets that have already been tagged with HED. It mainly requires that they will need to validate against HED schemas 8.0.0 or later.

We want to know if you foresee any issues or there are objections before we do a pull request to make this happen.

@sappelhoff
Copy link
Member

Hi @VisLab, thanks for checking in and cool to hear about HED improvements.

We have created a new branch of the HED schema that allows all of the SI units specified by bids

Yes, BIDS RECOMMENDS SI units unless impossible (we are using the word SHOULD in the spec, not MUST), but we the specification is a bit unclear as to how the SI units should be formatted, see bids-standard/bids-specification#411

so my question is: what formatting does the new HED schema accept for units? E.g., micro volts could be theoretically formatted as any of the following: µV, microV, uV, ...

including multiples and sub-multiples

could you please explain what you mean by that? Combined units like V/s or something like that?

The issue is that our new validator will no longer support previous versions of the HED schema.

given that datasets until now have been tagged with previous HED schema versions ... how can these datasets NOT be affected? What happens if the new validator encounters a previous version of HED schemas?

@VisLab
Copy link
Member Author

VisLab commented Apr 11, 2020 via email

@sappelhoff
Copy link
Member

Thanks for the details, unfortunately there is still something that I don't understand (perhaps @effigies or @rwblair can help me).

Specifically, I don't see how these two points you mention go together [emphasis added by me]:

  1. All of the existing allowed units are still in there so data that has been previously tagged will still validate
  2. We added several attributes to the schema (and this is why our
    validators are not backwards compatible without some hassle).

If old valid HED datasets are still valid with the new HED validator, that would be what I call backward compatible. In a case of backward compatibility, I a see no issue to update the HED validator within bids-validator.

Perhaps you could once more specify what you meant with this sentence from your original post [parts in [] added by me]:

It mainly requires that they [the users] will need to validate against HED schemas 8.0.0 or later.

When users apply bids-validator, they do not specify any HED schema - at least I do not see an option like this in the CLI parameters that are possible (such as --ignoreNiftiHeaders)

@VisLab
Copy link
Member Author

VisLab commented Apr 12, 2020 via email

@sappelhoff
Copy link
Member

Users can specify a schema that is earlier explicitly in the hed-validator. I think that we need to discuss how we might have an optional command line argument for the BIDS validator to specify a HED schema that is not the latest..

that is one option. I think that another option would be to add a field to some BIDS .json (e.g., the events.json file) a new HED field called something like HEDSchemaVersion, where users put the version of the schema they used for populating the HED column in their events.tsv file.

This could further be clarified in the HED section in the BIDS specification, because currently, there is nothing about different HED schemas there.

And then the HED-validator within bids-validator could (1) search for the HEDSchemaVersion and apply the appropriate HED-validator, or (2) if no HEDSchemaVersion field is found, apply the latest HED-validator.

What do you think?

@happy5214
Copy link
Collaborator

Basically, the new validator branch as written would not permit users to validate against a schema that doesn't include unit modifiers, or everything older than 8.0.0. However, since all explicit units included in earlier versions of the schema (including the current version) are either explicitly in version 8.0.0 or are derivable from those units through plurals and unit prefixes, so no HED string valid now should become invalid because of these changes. But HED strings that are already invalid with the current schema could not be validated at all using the schema version for which they were written, since it wouldn't load the schema at all given the current code.

In bids-validator/validators/events/hed.js, line 102, the buildSchema function exported by the hed-validator package is called with no arguments. That causes the schema code in hed-validator to default to the Latest version, which is a manually maintained symlink in the HED specification repo that points to the latest released version.

There is currently no mechanism through which to pass a version to buildSchema through bids-validator. It could be done by passing as a parameter to buildSchema an object that contains either a path (for a local schema copy) or version (which loads the corresponding version from the same HED spec repo). However, we'd need to get a version override through the code to the call to buildSchema. I don't know how hard it is to pass a command-line option that deep. The code is already loading events.json through the sidecar mechanism to work with the custom column values, so adding the schema version to that would not be difficult.

@effigies
Copy link
Collaborator

This sounds reasonable to me.

@VisLab
Copy link
Member Author

VisLab commented Apr 13, 2020

Sounds good to me as well. We have also had a discussion about doing some minor restructuring of our code so that:

  1. if the schema doesn't have unit modifiers, then only explicitly given unit classes values will be allowed as units.
  2. if the schema doesn't have unit classes, the validator doesn't validate units at all.

We believe that these changes will allow the validator to work with previous versions of the schema. We'll get back to you after we look at this in more detail.

@happy5214
Copy link
Collaborator

The changes described above have been made to hed-validator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants