Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_raw_bids() respects info['bads'], but it really shouldn't #498

Closed
hoechenberger opened this issue Aug 9, 2020 · 26 comments · Fixed by #499
Closed

read_raw_bids() respects info['bads'], but it really shouldn't #498

hoechenberger opened this issue Aug 9, 2020 · 26 comments · Fixed by #499
Labels

Comments

@hoechenberger
Copy link
Member

Describe the bug

While working in #491 I think I've discovered a bug in _handle_channels_reading(), which is used by read_raw_bids(). Specifically, when reading data and populating the list of "bads", MNE-BIDS currently merges the information from _channels.tsv and info['bads'] in the data file. I think this is not correct, and only _channels.tsv should be honored.

mne-bids/mne_bids/read.py

Lines 259 to 263 in 8eec738

bads = np.asarray(channels_dict['name'])[bad_bool]
# merge with bads already present in raw data file (if there are any)
unique_bads = set(raw.info['bads']).union(set(bads))
raw.info['bads'] = list(unique_bads)

Steps to reproduce

Checkout hoechenberger:mark-bads (the branch from #491). Run:

import os.path as op
import mne
from mne_bids import (make_bids_basename, write_raw_bids, read_raw_bids,
                      mark_bad_channels)

data_path = mne.datasets.sample.data_path()
raw_fname = op.join(data_path, 'MEG', 'sample', 'sample_audvis_raw.fif')
bids_root = op.join(data_path, '..', 'MNE-sample-data-bids')
bids_basename = make_bids_basename(subject='01', session='01',
                                   task='audiovisual', run='01')

raw = mne.io.read_raw_fif(raw_fname, verbose=False)
print(f'The following channels are marked as bad: {raw.info["bads"]}')

write_raw_bids(raw, bids_basename=bids_basename, bids_root=bids_root,
               overwrite=True, verbose=False)

# Mark two channels as bad, RETAINING existing bads.
bads = ['MEG 0112', 'MEG 0131']
mark_bad_channels(ch_names=bads, bids_basename=bids_basename,
                  bids_root=bids_root)

raw = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root,
                    verbose=False)
print(f'After marking MEG 0112 and MEG 0131 as bad, the following channels '
      f'are now marked as bad: {raw.info["bads"]}')

# Mark the same channels as bad, NOT RETAINING any existing bads.
bads = ['MEG 0112', 'MEG 0131']
mark_bad_channels(ch_names=bads, bids_basename=bids_basename,
                  bids_root=bids_root, overwrite=True)

raw = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root,
                    verbose=False)
print(f'After marking MEG 0112 and MEG 0131 as bad and passing '
      f'`overwrite=True`, the following channels '
      f'are now marked as bad: {raw.info["bads"]}')

Expected results

The following channels are marked as bad: ['MEG 2443', 'EEG 053']

After marking MEG 0112 and MEG 0131 as bad, the following channels are now marked as bad: ['MEG 0112', 'EEG 053', 'MEG 0131', 'MEG 2443']

After marking MEG 0112 and MEG 0131 as bad and passing `overwrite=True`, the following channels are now marked as bad: ['MEG 0131', 'MEG 0112']

Actual results

The following channels are marked as bad: ['MEG 2443', 'EEG 053']

After marking MEG 0112 and MEG 0131 as bad, the following channels are now marked as bad: ['MEG 0112', 'EEG 053', 'MEG 0131', 'MEG 2443']

After marking MEG 0112 and MEG 0131 as bad and passing `overwrite=True`, the following channels are now marked as bad: ['EEG 053', 'MEG 0131', 'MEG 0112', 'MEG 2443']

Additional information

Looking into sub-01_ses-01_task-audiovisual_run-01_channels.tsv, it's clear that it's being updated correctly:

$ grep bad sub-01_ses-01_task-audiovisual_run-01_channels.tsv
MEG 0112        MEGGRADPLANAR   T/m     0.10000000149011612     172.17630004882812      Planar Gradiometer      600.614990234375        bad     n/a
MEG 0131        MEGMAG  T       0.10000000149011612     172.17630004882812      Magnetometer    600.614990234375        bad     n/a
$
@agramfort
Copy link
Member

agramfort commented Aug 9, 2020 via email

@jasmainak
Copy link
Member

but isn't the STATUS column in BIDS optional?

hoechenberger added a commit to hoechenberger/mne-bids that referenced this issue Aug 9, 2020
@hoechenberger
Copy link
Member Author

but isn't the STATUS column in BIDS optional?

Yes. But if you want to mark channels as bad in BIDS, you will have to bite the bullet and add this column :)

@jasmainak
Copy link
Member

yes, but then we have a lossy reader, it does not get all the information that is there in the file

@jasmainak
Copy link
Member

I think many Elekta users may feel it is sufficient to make the dataset pass the validator but will be surprised that the bad channels are not read in. We shouldn't require the users to know all the BIDS standard by heart and know where to put things.

@hoechenberger
Copy link
Member Author

yes, but then we have a lossy reader, it does not get all the information that is there in the file

We already do, e.g. regarding events…

@hoechenberger
Copy link
Member Author

We shouldn't require the users to know all the BIDS standard by heart and know where to put things.

MNE-BIDS writes the bads to channels.tsv. I would actually suggest we clear info['bads'] upon writing, too.

@hoechenberger
Copy link
Member Author

hoechenberger commented Aug 9, 2020

Following your suggestion would also have implications on #491 – currently I only update the sidecar, but you're suggesting we write an entire updated raw file / info dict to disk as well. I mean, we could do that, but it doesn't seem very elegant…

@hoechenberger
Copy link
Member Author

I'm really just super concerned about keeping everything in sync if we start allowing multiple places to store the same info. Esp since one of them (info['bads']) is lossy in the sense that it cannot hold a status_description.

@jasmainak
Copy link
Member

MNE-BIDS writes the bads to channels.tsv.

now the question is do you want only MNE-BIDS written files to be readable in MNE-BIDS? I understand the issue of keeping things in sync. I am just saying we need a clear policy that is documented and adhere to it. What you're suggesting means that everything in raw.info needs to be overwritten. So it's really not just a question of bads but all the other attributes

@hoechenberger
Copy link
Member Author

hoechenberger commented Aug 9, 2020

What you're suggesting means that everything in raw.info needs to be overwritten. So it's really not just a question of bads but all the other attributes

My understanding so far was that our policy in MNE-BIDS is that if we have BIDS metadata for an entity, it will always take precedence over anything in the raw data / raw.info for that entity.

Now when we're looking at channels.tsv and the status column is missing, to me this encodes the exact same information as "all channels are good, as none has been marked as bad", and should take precedence over info['bads']. I do see your point though. Just wanted to explain where my thinking is coming from.

@jasmainak
Copy link
Member

jasmainak commented Aug 9, 2020

yeah, I understand. But does taking precedence mean ignoring everything in raw.info or does it mean ignoring raw.info only if BIDS metadata exists? My understanding was the latter. Along which we wrote the code for detecting line freq (and probably other metadata)? This is a good opportunity to form a clear policy and document it somewhere.

@hoechenberger
Copy link
Member Author

or does it mean ignoring raw.info only if BIDS metadata exists?

This.

However we seem to have different views on what a missing status column in channels.tsv means :) To me the missing column encodes something too: that all channels are to be considered "good". Your view is that it's missing information, therefore info['bads'] should take precedence.

@hoechenberger
Copy link
Member Author

I also think this highlights that the status columnn should be REQUIRED instead of OPTIONAL…

@jasmainak
Copy link
Member

Yes :) I think we now have a common understanding! Just need to figure out the way forward that everyone agrees with.

agree, the use of OPTIONAL vs RECOMMENDED vs REQUIRED is very much abused in BIDS

@hoechenberger
Copy link
Member Author

(btw intuitively I would have NO clue whatsoever what the difference between "optional" and "recommended" would be. This almost seems to imply that there are options which are totally useless, bc otherwise they would be recommended…)

@jasmainak
Copy link
Member

I believe there were murmurs of throwing warnings with "recommended" in the validator. There might be an open issue? But not sure if it was ever taken care of. Intuitively, recommended is something that's probably missing because of the recording system you are using but if it's there, you really should put that info in.

@hoechenberger
Copy link
Member Author

Let's bring @sappelhoff in here 😅

@sappelhoff
Copy link
Member

or does it mean ignoring raw.info only if BIDS metadata exists?

agreed

To me the missing column encodes something too: that all channels are to be considered "good".

to me a missing column communicates that all row entries would be n/a

I also think this highlights that the status columnn should be REQUIRED instead of OPTIONAL…

but labelling channels as "good" or "bad" is a subjective process (unless a clearly objective policy is documented, preferably in the form of a reproducible algorithm), and there are certainly cases where people want to share data, but not their judgment of channel status --> for example because they didn't check every channel.

In those cases, a REQUIRED status column would just be n/a, which is the same as no status column (to my interpretation)

(btw intuitively I would have NO clue whatsoever what the difference between "optional" and "recommended" would be. This almost seems to imply that there are options which are totally useless, bc otherwise they would be recommended…)

see bids-standard/bids-specification#563 --> OPTIONAL in BIDS is often more like "RECOMMENDED under condition X, but not condition Y", rather than "truly optional".

I believe there were murmurs of throwing warnings with "recommended" in the validator.

that would certainly be reasonable for many of the recommendations. But we have too few contributors who are interested in making the validator strong. I am already very thankful to @adam2392 for starting to make a few PRs and I think we (as MNE-BIDS team) contribute a lot to the validator (and the spec) by pointing out flaws, inconsistencies, and missing coverage. We're at the front line ... so to say.

@hoechenberger
Copy link
Member Author

Ok so… now, what should be the behavior if we encounter both, info['bads'] and status in channels.tsv if they don't match?

@sappelhoff
Copy link
Member

Ok so… now, what should be the behavior if we encounter both, info['bads'] and status in channels.tsv if they don't match?

warn the user about the inconsistency and explicitly use the BIDS information

jasmainak pushed a commit that referenced this issue Aug 10, 2020
* Grant channels.tsv sole authority over Good & Bad

Closes #498

* New approach based on discussion

* Fix

* Update tests

* Update changelog

* Fix comment

* Remove line
@jasmainak
Copy link
Member

based on thinking in #495 I believe mismatch should be a validator error. See for example what they do with niftis

@sappelhoff
Copy link
Member

For the record: I agree with Mainak on this, but the validator is not "there yet" and until then we need to be pragmatic, so I stay with my earlier recommendation

@jasmainak
Copy link
Member

sounds good, but the writing API in #491 might introduce these kinds of inconsistencies.

@hoechenberger
Copy link
Member Author

sounds good, but the writing API in #491 might introduce these kinds of inconsistencies.

No, it actually does away with it :) I've designed it such that (un)marking channels as "bad" will alter both, channels.tsv and Raw.info['bads'] :)

@jasmainak
Copy link
Member

ahhh, wonderful!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants