read_raw_bids() respects info['bads'], but it really shouldn't #498

hoechenberger · 2020-08-09T19:53:59Z

Describe the bug

While working in #491 I think I've discovered a bug in _handle_channels_reading(), which is used by read_raw_bids(). Specifically, when reading data and populating the list of "bads", MNE-BIDS currently merges the information from _channels.tsv and info['bads'] in the data file. I think this is not correct, and only _channels.tsv should be honored.

mne-bids/mne_bids/read.py

Lines 259 to 263 in 8eec738

    
           bads = np.asarray(channels_dict['name'])[bad_bool] 
        
           # merge with bads already present in raw data file (if there are any) 
        
           unique_bads = set(raw.info['bads']).union(set(bads)) 
        
           raw.info['bads'] = list(unique_bads)

Steps to reproduce

Checkout hoechenberger:mark-bads (the branch from #491). Run:

import os.path as op
import mne
from mne_bids import (make_bids_basename, write_raw_bids, read_raw_bids,
                      mark_bad_channels)

data_path = mne.datasets.sample.data_path()
raw_fname = op.join(data_path, 'MEG', 'sample', 'sample_audvis_raw.fif')
bids_root = op.join(data_path, '..', 'MNE-sample-data-bids')
bids_basename = make_bids_basename(subject='01', session='01',
                                   task='audiovisual', run='01')

raw = mne.io.read_raw_fif(raw_fname, verbose=False)
print(f'The following channels are marked as bad: {raw.info["bads"]}')

write_raw_bids(raw, bids_basename=bids_basename, bids_root=bids_root,
               overwrite=True, verbose=False)

# Mark two channels as bad, RETAINING existing bads.
bads = ['MEG 0112', 'MEG 0131']
mark_bad_channels(ch_names=bads, bids_basename=bids_basename,
                  bids_root=bids_root)

raw = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root,
                    verbose=False)
print(f'After marking MEG 0112 and MEG 0131 as bad, the following channels '
      f'are now marked as bad: {raw.info["bads"]}')

# Mark the same channels as bad, NOT RETAINING any existing bads.
bads = ['MEG 0112', 'MEG 0131']
mark_bad_channels(ch_names=bads, bids_basename=bids_basename,
                  bids_root=bids_root, overwrite=True)

raw = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root,
                    verbose=False)
print(f'After marking MEG 0112 and MEG 0131 as bad and passing '
      f'`overwrite=True`, the following channels '
      f'are now marked as bad: {raw.info["bads"]}')

Expected results

The following channels are marked as bad: ['MEG 2443', 'EEG 053']

After marking MEG 0112 and MEG 0131 as bad, the following channels are now marked as bad: ['MEG 0112', 'EEG 053', 'MEG 0131', 'MEG 2443']

After marking MEG 0112 and MEG 0131 as bad and passing `overwrite=True`, the following channels are now marked as bad: ['MEG 0131', 'MEG 0112']

Actual results

The following channels are marked as bad: ['MEG 2443', 'EEG 053']

After marking MEG 0112 and MEG 0131 as bad, the following channels are now marked as bad: ['MEG 0112', 'EEG 053', 'MEG 0131', 'MEG 2443']

After marking MEG 0112 and MEG 0131 as bad and passing `overwrite=True`, the following channels are now marked as bad: ['EEG 053', 'MEG 0131', 'MEG 0112', 'MEG 2443']

Additional information

Looking into sub-01_ses-01_task-audiovisual_run-01_channels.tsv, it's clear that it's being updated correctly:

$ grep bad sub-01_ses-01_task-audiovisual_run-01_channels.tsv
MEG 0112        MEGGRADPLANAR   T/m     0.10000000149011612     172.17630004882812      Planar Gradiometer      600.614990234375        bad     n/a
MEG 0131        MEGMAG  T       0.10000000149011612     172.17630004882812      Magnetometer    600.614990234375        bad     n/a
$

The text was updated successfully, but these errors were encountered:

agramfort · 2020-08-09T20:13:36Z

Agreed

jasmainak · 2020-08-09T20:27:20Z

but isn't the STATUS column in BIDS optional?

Closes mne-tools#498

hoechenberger · 2020-08-09T20:29:32Z

but isn't the STATUS column in BIDS optional?

Yes. But if you want to mark channels as bad in BIDS, you will have to bite the bullet and add this column :)

jasmainak · 2020-08-09T20:31:26Z

yes, but then we have a lossy reader, it does not get all the information that is there in the file

jasmainak · 2020-08-09T20:32:44Z

I think many Elekta users may feel it is sufficient to make the dataset pass the validator but will be surprised that the bad channels are not read in. We shouldn't require the users to know all the BIDS standard by heart and know where to put things.

hoechenberger · 2020-08-09T20:35:54Z

yes, but then we have a lossy reader, it does not get all the information that is there in the file

We already do, e.g. regarding events…

hoechenberger · 2020-08-09T20:36:55Z

We shouldn't require the users to know all the BIDS standard by heart and know where to put things.

MNE-BIDS writes the bads to channels.tsv. I would actually suggest we clear info['bads'] upon writing, too.

hoechenberger · 2020-08-09T20:38:59Z

Following your suggestion would also have implications on #491 – currently I only update the sidecar, but you're suggesting we write an entire updated raw file / info dict to disk as well. I mean, we could do that, but it doesn't seem very elegant…

hoechenberger · 2020-08-09T20:40:23Z

I'm really just super concerned about keeping everything in sync if we start allowing multiple places to store the same info. Esp since one of them (info['bads']) is lossy in the sense that it cannot hold a status_description.

jasmainak · 2020-08-09T20:43:17Z

MNE-BIDS writes the bads to channels.tsv.

now the question is do you want only MNE-BIDS written files to be readable in MNE-BIDS? I understand the issue of keeping things in sync. I am just saying we need a clear policy that is documented and adhere to it. What you're suggesting means that everything in raw.info needs to be overwritten. So it's really not just a question of bads but all the other attributes

hoechenberger · 2020-08-09T20:48:27Z

What you're suggesting means that everything in raw.info needs to be overwritten. So it's really not just a question of bads but all the other attributes

My understanding so far was that our policy in MNE-BIDS is that if we have BIDS metadata for an entity, it will always take precedence over anything in the raw data / raw.info for that entity.

Now when we're looking at channels.tsv and the status column is missing, to me this encodes the exact same information as "all channels are good, as none has been marked as bad", and should take precedence over info['bads']. I do see your point though. Just wanted to explain where my thinking is coming from.

jasmainak · 2020-08-09T20:50:53Z

yeah, I understand. But does taking precedence mean ignoring everything in raw.info or does it mean ignoring raw.info only if BIDS metadata exists? My understanding was the latter. Along which we wrote the code for detecting line freq (and probably other metadata)? This is a good opportunity to form a clear policy and document it somewhere.

hoechenberger · 2020-08-09T20:58:20Z

or does it mean ignoring raw.info only if BIDS metadata exists?

This.

However we seem to have different views on what a missing status column in channels.tsv means :) To me the missing column encodes something too: that all channels are to be considered "good". Your view is that it's missing information, therefore info['bads'] should take precedence.

hoechenberger · 2020-08-09T20:58:58Z

I also think this highlights that the status columnn should be REQUIRED instead of OPTIONAL…

jasmainak · 2020-08-09T21:00:55Z

Yes :) I think we now have a common understanding! Just need to figure out the way forward that everyone agrees with.

agree, the use of OPTIONAL vs RECOMMENDED vs REQUIRED is very much abused in BIDS

hoechenberger · 2020-08-09T21:03:27Z

(btw intuitively I would have NO clue whatsoever what the difference between "optional" and "recommended" would be. This almost seems to imply that there are options which are totally useless, bc otherwise they would be recommended…)

jasmainak · 2020-08-09T21:12:53Z

I believe there were murmurs of throwing warnings with "recommended" in the validator. There might be an open issue? But not sure if it was ever taken care of. Intuitively, recommended is something that's probably missing because of the recording system you are using but if it's there, you really should put that info in.

hoechenberger · 2020-08-09T21:41:52Z

Let's bring @sappelhoff in here 😅

sappelhoff · 2020-08-10T07:18:47Z

or does it mean ignoring raw.info only if BIDS metadata exists?

agreed

To me the missing column encodes something too: that all channels are to be considered "good".

to me a missing column communicates that all row entries would be n/a

I also think this highlights that the status columnn should be REQUIRED instead of OPTIONAL…

but labelling channels as "good" or "bad" is a subjective process (unless a clearly objective policy is documented, preferably in the form of a reproducible algorithm), and there are certainly cases where people want to share data, but not their judgment of channel status --> for example because they didn't check every channel.

In those cases, a REQUIRED status column would just be n/a, which is the same as no status column (to my interpretation)

(btw intuitively I would have NO clue whatsoever what the difference between "optional" and "recommended" would be. This almost seems to imply that there are options which are totally useless, bc otherwise they would be recommended…)

see bids-standard/bids-specification#563 --> OPTIONAL in BIDS is often more like "RECOMMENDED under condition X, but not condition Y", rather than "truly optional".

I believe there were murmurs of throwing warnings with "recommended" in the validator.

that would certainly be reasonable for many of the recommendations. But we have too few contributors who are interested in making the validator strong. I am already very thankful to @adam2392 for starting to make a few PRs and I think we (as MNE-BIDS team) contribute a lot to the validator (and the spec) by pointing out flaws, inconsistencies, and missing coverage. We're at the front line ... so to say.

hoechenberger · 2020-08-10T07:32:57Z

Ok so… now, what should be the behavior if we encounter both, info['bads'] and status in channels.tsv if they don't match?

sappelhoff · 2020-08-10T07:34:15Z

Ok so… now, what should be the behavior if we encounter both, info['bads'] and status in channels.tsv if they don't match?

warn the user about the inconsistency and explicitly use the BIDS information

* Grant channels.tsv sole authority over Good & Bad Closes #498 * New approach based on discussion * Fix * Update tests * Update changelog * Fix comment * Remove line

jasmainak · 2020-08-11T21:58:54Z

based on thinking in #495 I believe mismatch should be a validator error. See for example what they do with niftis

sappelhoff · 2020-08-12T07:47:47Z

For the record: I agree with Mainak on this, but the validator is not "there yet" and until then we need to be pragmatic, so I stay with my earlier recommendation

jasmainak · 2020-08-12T14:22:38Z

sounds good, but the writing API in #491 might introduce these kinds of inconsistencies.

hoechenberger · 2020-08-12T15:30:14Z

sounds good, but the writing API in #491 might introduce these kinds of inconsistencies.

No, it actually does away with it :) I've designed it such that (un)marking channels as "bad" will alter both, channels.tsv and Raw.info['bads'] :)

jasmainak · 2020-08-12T15:46:04Z

ahhh, wonderful!

hoechenberger added the bug label Aug 9, 2020

hoechenberger added a commit to hoechenberger/mne-bids that referenced this issue Aug 9, 2020

Grant channels.tsv sole authority over Good & Bad

217bcff

Closes mne-tools#498

hoechenberger mentioned this issue Aug 9, 2020

[MRG] Rework bad channels population #499

Merged

7 tasks

hoechenberger mentioned this issue Aug 9, 2020

[MRG] Allow marking channels as bad in existing datasets #491

Merged

7 tasks

jasmainak closed this as completed in #499 Aug 10, 2020

jasmainak pushed a commit that referenced this issue Aug 10, 2020

[MRG] Rework bad channels population (#499)

ec031fa

* Grant channels.tsv sole authority over Good & Bad Closes #498 * New approach based on discussion * Fix * Update tests * Update changelog * Fix comment * Remove line

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

read_raw_bids() respects info['bads'], but it really shouldn't #498

read_raw_bids() respects info['bads'], but it really shouldn't #498

hoechenberger commented Aug 9, 2020

agramfort commented Aug 9, 2020 via email

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

hoechenberger commented Aug 9, 2020 •

edited

Loading

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020 •

edited

Loading

jasmainak commented Aug 9, 2020 •

edited

Loading

hoechenberger commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

sappelhoff commented Aug 10, 2020

hoechenberger commented Aug 10, 2020

sappelhoff commented Aug 10, 2020

jasmainak commented Aug 11, 2020

sappelhoff commented Aug 12, 2020

jasmainak commented Aug 12, 2020

hoechenberger commented Aug 12, 2020

jasmainak commented Aug 12, 2020

read_raw_bids() respects info['bads'], but it really shouldn't #498

read_raw_bids() respects info['bads'], but it really shouldn't #498

Comments

hoechenberger commented Aug 9, 2020

Describe the bug

Steps to reproduce

Expected results

Actual results

Additional information

agramfort commented Aug 9, 2020 via email

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

hoechenberger commented Aug 9, 2020 • edited Loading

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020 • edited Loading

jasmainak commented Aug 9, 2020 • edited Loading

hoechenberger commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

jasmainak commented Aug 9, 2020

hoechenberger commented Aug 9, 2020

sappelhoff commented Aug 10, 2020

hoechenberger commented Aug 10, 2020

sappelhoff commented Aug 10, 2020

jasmainak commented Aug 11, 2020

sappelhoff commented Aug 12, 2020

jasmainak commented Aug 12, 2020

hoechenberger commented Aug 12, 2020

jasmainak commented Aug 12, 2020

hoechenberger commented Aug 9, 2020 •

edited

Loading

hoechenberger commented Aug 9, 2020 •

edited

Loading

jasmainak commented Aug 9, 2020 •

edited

Loading