ENH: Clarify sample column values to be 0-based and possibly non-integer #1441
Codecov Report: Patch and project coverage have no change.
Coverage Diff (master vs. #1441):
Coverage: 88.01% (master), 88.01% (#1441)
Files: 14, 14
Lines: 1268, 1268
Hits: 1116, 1116
Misses: 152, 152 |
We need fractional samples for EEG, MEG, and iEEG, and I think this update goes against that. We need them because events are often acquired with separate hardware that has a higher temporal resolution than the EEG. For example, a typical EEG file may be sampled at 500 Hz while events are recorded at 1000 Hz or more (this is useful because, in psychophysics, you need millisecond precision on events). Most BIDS export tools allow placing events from other software (Presentation, E-Prime) into the EEG events file. |
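To illustrate the point about fractional samples, here is a minimal sketch. It assumes the sample value is simply the onset (in seconds from the start of the recording) multiplied by the EEG sampling rate, counted from 0. The function name `onset_to_sample` and the example onsets are hypothetical, not taken from the specification.

```python
def onset_to_sample(onset_s: float, sfreq_hz: float) -> float:
    """Return the 0-based, possibly fractional, sample position of an event.

    Assumption: the recording starts at onset 0 s, so sample = onset * sfreq.
    """
    return onset_s * sfreq_hz


# EEG sampled at 500 Hz; events timestamped with millisecond resolution.
eeg_sfreq_hz = 500.0
event_onsets_s = [0.0, 0.001, 0.0125]

samples = [onset_to_sample(t, eeg_sfreq_hz) for t in event_onsets_s]

# Events that fall between two EEG samples produce non-integer values.
falls_on_sample = [s.is_integer() for s in samples]
```

An event logged 1 ms after the start lands halfway between EEG samples 0 and 1, which is why an integer-only column would lose timing information. |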
Could you recommend alternative wording? I was aiming to allow fractional samples, but if it can be read as restricting them, we have a problem. |
This is what I remember from the BEP development process:
The If we make |
The sample column also came up in the INCF/BEP032 meeting on Wednesday regarding animal electrophysiology data. Note that there we identified that "sample" appears to be used both for "data sample" (common terminology in EEG/iEEG/MEG but also in LFP) and for "biological sample" as used in BIDS for microscopy <https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/10-microscopy.html#required-samples-file>. But in this issue it is about "data samples".
Wednesday during the online meeting I was not able to find "sample" defined in the BIDS spec, but I did point to some examples that have it in their events.tsv. Here is one <https://github.com/bids-standard/bids-examples/blob/master/eeg_rishikesh/sub-001/ses-02/eeg/sub-001_ses-02_task-meditation_events.tsv>, here another <https://github.com/bids-standard/bids-examples/blob/master/eeg_matchingpennies/sub-07/eeg/sub-07_task-matchingpennies_events.tsv>.
I have used it myself in my own datasets, but was under the impression that sample in events.tsv was an additional column not specified by BIDS. Just like countdown_offset and value, it would be appropriate to document them in a corresponding events.json.
Given that I thought sample was my own free choice, I was a bit confused when I found this in the yaml file <https://github.com/sappelhoff/bids-specification/blob/8f21b45d78b41b81dc84f1c3dc9c3c516f74e11e/src/schema/objects/columns.yaml#L391>, which I think is the reason for sample appearing in the glossary <https://bids-specification.readthedocs.io/en/stable/glossary.html#sample-columns>. I think this is now also the reason for this issue being discussed, i.e. to formally define sample. But is that needed? We can also leave it to the user (and hope that the events.json is specific about it being 0- or 1-indexed). It is redundant with the onset column anyway, right? |
Back in version 1.2.1 sample was a recommended column:
https://github.com/bids-standard/bids-specification/blob/3d55770c21ac5cc9015c93adb9bc145d0318bfa1/src/04-modality-specific-files/05-task-events.md
Maybe we just want to take it out of the events altogether.
…On Fri, Mar 17, 2023 at 7:40 AM Robert Oostenveld wrote: [quoted text: the comment above, in full]
|
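The remark that sample is redundant with onset, and the worry about 0- versus 1-based indexing, can be made concrete with a small check. This is only a sketch: it assumes the recording starts at onset 0 s, that the sampling rate is known (in a real dataset it would come from a sidecar file), and that the file has `onset` and `sample` columns. The helper `infer_sample_base` and the inline TSV content are hypothetical.

```python
import csv
import io


def infer_sample_base(rows, sfreq_hz, tol=1e-6):
    """Return 0 or 1 if every sample equals onset * sfreq plus that base, else None."""
    for base in (0, 1):
        if all(
            abs(float(r["sample"]) - (float(r["onset"]) * sfreq_hz + base)) <= tol
            for r in rows
        ):
            return base
    return None


# Hypothetical events.tsv content with a fractional sample in the second row.
events_tsv = (
    "onset\tduration\tsample\n"
    "0.0\t0.0\t0\n"
    "0.001\t0.0\t0.5\n"
    "2.0\t0.0\t1000\n"
)

rows = list(csv.DictReader(io.StringIO(events_tsv), delimiter="\t"))
base = infer_sample_base(rows, sfreq_hz=500.0)
```

If the check returns `None`, the column is not a plain rescaling of onset, and the events.json description is the only place a reader can learn what it means. |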
Ah, thanks @sappelhoff. I now see that it is mentioned in the current spec regarding the task events. In general I personally prefer the BIDS specification to be as small as possible, but as large as needed. I don't see the need for this being defined in the spec, as it can also be specified in the events.json. Hence I would not object if |
If we were to remove (effectively "demote") Not sure whether that would classify as a backward incompatible change, as some dataset curators may have relied on using the definition of I am in favor of making the definition of The only "wonky" thing about my proposal is that we have to define post-hoc whether In other news: What happens if a dataset curator uses a pre-defined column (like Let's discuss this latter thing here: |
After talking with @effigies yesterday, I am now also in favor of demoting For these reasons:
|
Based on discussion in bids-standard/bids-examples#356, it seems that defining the sample column of events.tsv as containing integer values may have been mistaken. This column was introduced in 1.2.0, when EEG and iEEG were added to the specification, and there does not appear to be consensus among contributors to those BEPs that this was intended to be an integer, with @arnodelorme stating:
If sample is not purely an index, then it is a value where the time unit is not seconds. Therefore it should be possible to say "0 samples from the start of the run", resolving issue #499 in favor of 0-based "indexing".
This PR is necessary in order to correctly validate events.tsv files that contain non-integer sample indices.
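As a rough illustration of what accepting non-integer sample values means for a validator, the sketch below treats a cell as valid when it parses as a finite number, integer or not. It is not the BIDS validator's implementation. The function name `is_valid_sample` is hypothetical, and accepting "n/a" for missing values is an assumption based on the general BIDS convention for tabular files.

```python
import math


def is_valid_sample(value: str) -> bool:
    """Return True if a sample cell holds a finite number (integer or fractional).

    Assumptions: "n/a" marks a missing value and is accepted; no sign
    restriction is applied because the discussion does not settle one.
    """
    if value == "n/a":
        return True
    try:
        number = float(value)
    except ValueError:
        return False
    return math.isfinite(number)


# Hypothetical cells from a sample column.
cells = ["0", "12.5", "1000", "n/a", "abc", "nan"]
results = [is_valid_sample(c) for c in cells]
```

Under an integer-only rule, the cell "12.5" would be rejected; under the relaxed rule sketched here, only the non-numeric and non-finite cells fail.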