[MRG] Allow marking channels as bad in existing datasets #491

hoechenberger · 2020-08-06T12:21:12Z

PR Description

I've started working on something we've needed for a long time now: marking channels of an existing BIDS dataset as "bad". This is needed because we'd like to encourage colleagues to convert to BIDS ASAP after data acquisition. Now, during data inspection / processing, they might discover problematic channels, and currently the only way to mark them as bad in the BIDS data is by editing the channels.tsv file(s) – not good! Even worse, some may opt to read the BIDS data, and mark channels as bad only in the derivative data they produce. Therefore we've figured we'd need a way to make it convenient to alter / amend existing metadata.

This implementation is just a first draft, adding a write.mark_bad_channels() function. It will read the relevant channels.tsv file, mark the requested channels as bad, and write the altered metadata back to disk, replacing the existing file. I do realize that what I'm doing here can be further abstracted, e.g. to also allow users to mark bad channels as good; or to update other bits of the metadata as well. However I just wanted to move ahead for now with a concrete implementation that has actual real-world relevance for us and our colleagues. Totally open to rethink and refactor this later, and looking forward to your thoughts and suggestions!
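The read, modify, write-back cycle described above can be sketched with the standard library alone. This is a minimal illustration, not the code added in this PR: the helper name `mark_bad_in_tsv` and the sample table are made up, and it assumes a `status` column is already present.

```python
import csv
import io


def mark_bad_in_tsv(tsv_text, bad_channels):
    """Return channels.tsv content with the given channels marked 'bad'.

    Hypothetical helper for illustration; assumes a 'status' column exists.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    fieldnames = list(reader.fieldnames)

    # Refuse to silently ignore channel names that are not in the file.
    known = {row["name"] for row in rows}
    missing = [ch for ch in bad_channels if ch not in known]
    if missing:
        raise ValueError(f"Channels not found in channels.tsv: {missing}")

    for row in rows:
        if row["name"] in bad_channels:
            row["status"] = "bad"

    # Serialize the table again, keeping the original column order.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


original = (
    "name\ttype\tunits\tstatus\n"
    "MEG 0111\tMEGMAG\tT\tgood\n"
    "MEG 0112\tMEGGRADPLANAR\tT/m\tgood\n"
    "EEG 001\tEEG\tV\tgood\n"
)
updated = mark_bad_in_tsv(original, ["MEG 0112"])
print(updated)
```

In a real dataset the returned text would replace the existing channels.tsv file on disk.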

WIP because I also want to add a command line interface, and I haven't run extensive tests so far.

cc @agramfort

Merge checklist

Maintainer, please confirm the following before merging:

All comments resolved
This is not your own PR
All CIs are happy
PR title starts with [MRG]
whats_new.rst is updated
PR description includes phrase "closes <#issue-number>"
Commit history does not contain any merge commits

codecov-commenter · 2020-08-06T12:28:51Z

Codecov Report

Merging #491 into master will decrease coverage by 0.30%.
The diff coverage is 86.73%.

@@            Coverage Diff             @@
##           master     #491      +/-   ##
==========================================
- Coverage   93.45%   93.14%   -0.31%     
==========================================
  Files          14       15       +1     
  Lines        2079     2175      +96     
==========================================
+ Hits         1943     2026      +83     
- Misses        136      149      +13

Impacted Files	Coverage Δ
mne_bids/commands/mne_bids_mark_bad_channels.py	`86.36% <86.36%> (ø)`
mne_bids/write.py	`95.71% <87.03%> (-1.04%)`	⬇️


sappelhoff · 2020-08-06T12:45:07Z

good idea, some initial remarks:

if in channels.tsv no status column is found, insert it
allow for providing a reason for why channel is bad? --> write status_description column

not sure about the API for my second point

hoechenberger · 2020-08-06T12:49:03Z

> if in channels.tsv no status column is found, insert it

Right. I assumed this column was REQUIRED but just checked the standard again and it is not. So I will insert it if not present and set all channels to "good", except for the ones to be marked as bad.

> allow for providing a reason for why channel is bad? --> write status_description column

Good idea, will add.
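A rough sketch of the two points agreed on here, assuming the table is held as a dict that maps column names to lists. The helper names and that data layout are assumptions for illustration, not the PR's actual code. A missing `status` column is filled with "good" as stated above, and a missing `status_description` column is filled with "n/a".

```python
def ensure_status_columns(columns):
    """Return a copy of the table with 'status' and 'status_description'."""
    n_channels = len(columns["name"])
    result = {key: list(values) for key, values in columns.items()}
    # setdefault() only adds a column when it is not there yet.
    result.setdefault("status", ["good"] * n_channels)
    result.setdefault("status_description", ["n/a"] * n_channels)
    return result


def mark_bad(columns, channels, descriptions=None):
    """Mark channels as bad, optionally recording a reason for each."""
    if descriptions is None:
        descriptions = ["n/a"] * len(channels)
    if len(channels) != len(descriptions):
        raise ValueError("Need exactly one description per channel.")

    result = ensure_status_columns(columns)
    for ch_name, description in zip(channels, descriptions):
        idx = result["name"].index(ch_name)  # raises ValueError if unknown
        result["status"][idx] = "bad"
        result["status_description"][idx] = description
    return result


table = {"name": ["Fp1", "Fp2", "Cz"], "type": ["EEG", "EEG", "EEG"]}
marked = mark_bad(table, ["Fp2"], ["flat signal"])
print(marked["status"])              # ['good', 'bad', 'good']
print(marked["status_description"])  # ['n/a', 'flat signal', 'n/a']
```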

sappelhoff · 2020-08-06T12:57:29Z

> So I will insert it if not present and set all channels to "good", except for the ones to be marked as bad.

I would set as n/a to be safe. n/a is always allowed and seems more appropriate in this case ... something like "we haven't looked"

edit: actually no. That might not be smart. forget what I wrote :-D

hoechenberger · 2020-08-06T13:05:46Z

Forget what exactly? :)

hoechenberger · 2020-08-06T13:17:31Z

If I read the specs correctly, only "good" and "bad" are allowed. I guess this kind of makes sense... anything not bad should be good :)

sappelhoff · 2020-08-06T13:26:12Z

n/a is always allowed, but I figured that when people are explicitly marking some channels as BAD, then it's fair to assume they have checked the rest and evaluated them as GOOD.

I would put n/a when we are unsure whether or not a channel was screened.

hoechenberger · 2020-08-06T17:02:27Z

I need advice on the API.

Because I wanted to make the description parameter optional,
and because I wanted to have it before the bids_basename and bids_root in the function signature,
and since we cannot do anything without bids_basename and bids_root,

I'm now setting bids_basename and bids_root to None by default and raise if they are None when the function is being invoked. This is kind of ugly, but I'm not sure how to do it better – it's a limitation in Python. Or am I missing something obvious here?

hoechenberger · 2020-08-06T17:51:48Z

> I'm now setting bids_basename and bids_root to None by default and raise if they are None when the function is being invoked. This is kind of ugly, but I'm not sure how to do it better – it's a limitation in Python. Or am I missing something obvious here?

We could use keyword-only arguments:

def mark_bad_channels(channels, *, descriptions=None, bids_basename,
                      bids_root, kind=None, verbose=True):

This would allow users to pass channels positionally or as a keyword, whichever they prefer; descriptions can be omitted or passed as a keyword; bids_basename and bids_root would be required, but would have to be specified as keywords, which is probably good practice here anyway.

Any objections?
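To make the behaviour of the bare `*` concrete, here is a stub with the signature proposed above. The body only echoes its arguments and the example values are invented; it shows how Python treats such a signature, not what the real function does.

```python
def mark_bad_channels(channels, *, descriptions=None, bids_basename,
                      bids_root, kind=None, verbose=True):
    """Stub: echo the arguments instead of touching any files."""
    return {
        "channels": list(channels),
        "descriptions": descriptions,
        "bids_basename": bids_basename,
        "bids_root": bids_root,
    }


# Required parameters after the bare * must be passed by keyword.
call = mark_bad_channels(["MEG 0112"], bids_basename="sub-01_task-rest",
                         bids_root="/data/bids")

# Passing them positionally is rejected by Python itself.
positional_error = None
try:
    mark_bad_channels(["MEG 0112"], "sub-01_task-rest", "/data/bids")
except TypeError as err:
    positional_error = str(err)

# Omitting them is rejected too, with no manual None check needed.
missing_error = None
try:
    mark_bad_channels(["MEG 0112"])
except TypeError as err:
    missing_error = str(err)

print(positional_error)
print(missing_error)
```

Keyword-only parameters may lack a default even when they follow a parameter that has one, which is what removes the need for the `None` sentinel.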

hoechenberger · 2020-08-06T18:04:39Z

Did it such that both channels and descriptions may be passed by position.

hoechenberger · 2020-08-06T19:24:03Z

Removed the WIP flag bc I've decided to add command-line support in a followup PR.

hoechenberger · 2020-08-06T21:03:40Z

I started to wonder if maybe instead of having two arguments channels and descriptions, which are both iterables, we should only have one argument that is a dict in the form {ch_name: description}. This would put a stronger emphasis on the fact that users usually should add a description. Also it would make it easier to visually keep track of which description belongs to which channel, should users need to mark a larger number (>4 or so) as bad.

WDYT?
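For comparison, a sketch of what the dict form could look like on the caller's side. `split_bads` is a hypothetical helper and the channel names and reasons are invented; it only shows that a `{ch_name: description}` mapping converts trivially into the two parallel lists.

```python
def split_bads(bads):
    """Turn {ch_name: description} into parallel lists (hypothetical)."""
    channels = list(bads)  # dicts keep insertion order in Python 3.7+
    descriptions = [bads[ch] or "n/a" for ch in channels]
    return channels, descriptions


bads = {
    "MEG 0112": "flat signal",
    "MEG 0131": "strong line noise",
    "EEG 053": None,  # no reason recorded
}
channels, descriptions = split_bads(bads)
print(channels)      # ['MEG 0112', 'MEG 0131', 'EEG 053']
print(descriptions)  # ['flat signal', 'strong line noise', 'n/a']
```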

hoechenberger · 2020-08-07T10:41:49Z

Following a suggestion by @agramfort, I have added an overwrite kwarg which allows users to specify whether they wish to only change the specified channels (overwrite=False, default), or whether to reset all channels, leaving only those passed to the function marked as bad, and marking all others as good. Even more, this now allows users to remove the "bad" status of all channels by passing channels=[], overwrite=True.
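The semantics of `overwrite` described above can be sketched as follows. `update_status` is a hypothetical stand-in that works on plain lists, not the function from this PR.

```python
def update_status(names, status, bad_channels, overwrite=False):
    """Return the new status list after marking bad_channels as bad."""
    unknown = set(bad_channels) - set(names)
    if unknown:
        raise ValueError(f"Unknown channels: {sorted(unknown)}")

    new_status = []
    for name, old in zip(names, status):
        if name in bad_channels:
            new_status.append("bad")
        elif overwrite:
            new_status.append("good")  # reset everything not listed
        else:
            new_status.append(old)     # leave existing entries alone
    return new_status


names = ["Fp1", "Fp2", "Cz"]
status = ["bad", "good", "good"]

print(update_status(names, status, ["Cz"]))
# ['bad', 'good', 'bad']   existing bad channel is kept
print(update_status(names, status, ["Cz"], overwrite=True))
# ['good', 'good', 'bad']  only Cz remains bad
print(update_status(names, status, [], overwrite=True))
# ['good', 'good', 'good'] all bads removed
```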

hoechenberger · 2020-08-07T11:06:55Z

If CIs pass, this is good to go from my end. I know it's quite a bit of code, but maybe you can start your review by looking at the Examples section I've included in the docstring of the new function.

agramfort

also can you add command line?

see mne_mark_bad_channels from mne-C

mne_bids/tests/test_write.py

mne_bids/write.py

mne_bids/tests/test_write.py

mne_bids/write.py

agramfort · 2020-08-07T13:44:50Z

Unfortunately this won't work, as the keys in params are different from what BIDSPath expects (e.g. sub vs. subject, etc.)

really unfortunate design choice :(

adam2392 · 2020-08-07T13:48:25Z

^ what do we think about either making all the entity Params explicit (e.g. subject instead of sub)

Or making it not shorthand in make_bids_basename and BIDSPath?

I’m in favor of #1.
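Until the names are unified, the mismatch could be bridged with a small translation table. The sketch below assumes that the parsed parameters use short keys such as `sub` and that the long names follow the BIDS entity names; the exact keys used by either function at the time are not shown in this thread, and `expand_entity_names` is a made-up helper.

```python
# Short BIDS entity keys and their long names (subset, for illustration).
SHORT_TO_LONG = {
    "sub": "subject",
    "ses": "session",
    "task": "task",
    "acq": "acquisition",
    "run": "run",
    "proc": "processing",
}


def expand_entity_names(params):
    """Rename short entity keys to long ones; unknown keys pass through."""
    return {SHORT_TO_LONG.get(key, key): value
            for key, value in params.items()}


params = {"sub": "01", "ses": "02", "task": "rest", "run": None}
print(expand_entity_names(params))
# {'subject': '01', 'session': '02', 'task': 'rest', 'run': None}
```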

agramfort · 2020-08-07T13:50:24Z

any consistent choice is fine with me.

hoechenberger · 2020-08-07T14:01:08Z

> ^ what do we think about either making all the entity Params explicit (e.g. subject instead of sub)

+1

hoechenberger · 2020-08-07T21:15:33Z

> also can you add command line?

Done in 285d0a0

hoechenberger · 2020-08-07T21:16:34Z

Missing functionality in the CLI: passing no bad channels and overwrite=True to reset all channels to "good". Will look into this tomorrow.

sappelhoff

I just checked the API and examples in the docstr, and that looked very good to me. I think this will be very useful.

mne_bids/write.py

mne_bids/commands/mne_bids_mark_bad_channels.py

hoechenberger · 2020-08-08T14:53:57Z

I think I have addressed all review comments and this should be good to merge.

jasmainak · 2020-08-13T05:06:43Z

Pybids has it: https://github.com/bids-standard/pybids/blob/master/examples/pybids_tutorial.ipynb

The closest thing we have in mne-bids is get_entity_vals. Not sure to what extent you want to duplicate pybids functionality. Their design is a bit too fancy for my taste ... there is (or used to be) a dependency grabbids which populated methods on demand based on which entities existed in the dataset. For example, if there is an entity called subject you will get a method called get_subjects. The details are a bit hazy in my mind but you have the keywords :)

jasmainak · 2020-08-13T05:09:08Z

Now the question is -- 1) to what extent do you want to duplicate that functionality (cf also reviewer comments in mne-bids paper) rather than work with pybids folks, and 2) can you get something that does the job using regular expressions without employing something as fancy as pybids? Maybe it works only for ephys but that's fine for us

agramfort · 2020-08-13T07:56:13Z

I was suspecting pybids had something like this. I think we can do this with 15 lines of code so I am not pushing for an extra dependency on pybids
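The thread does not show exactly which pybids feature is meant. Assuming it is about listing the values an entity takes across a dataset (as the mention of get_entity_vals suggests), a regular-expression approach without pybids could look roughly like this. `entity_values` and the file names are invented for illustration.

```python
import re


def entity_values(filenames, entity):
    """Collect the sorted, unique values of one entity key (e.g. 'sub')."""
    # An entity is written as key-value and separated by underscores.
    pattern = re.compile(rf"(?:^|_){re.escape(entity)}-([a-zA-Z0-9]+)")
    values = set()
    for fname in filenames:
        match = pattern.search(fname)
        if match:
            values.add(match.group(1))
    return sorted(values)


filenames = [
    "sub-01_ses-01_task-rest_meg.fif",
    "sub-01_ses-02_task-rest_meg.fif",
    "sub-02_ses-01_task-audvis_channels.tsv",
]
print(entity_values(filenames, "sub"))   # ['01', '02']
print(entity_values(filenames, "task"))  # ['audvis', 'rest']
```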

…

hoechenberger · 2020-08-13T08:42:38Z

Yes definitely let's look into implementing this ourselves or just borrowing the relevant bits from pybids, but let's try not to introduce this dependency

jasmainak · 2020-08-14T21:34:45Z

yes but definitely look into what pybids does. It will inform you what users want.

hoechenberger · 2020-08-17T06:52:27Z

> yes but definitely look into what pybids does. It will inform you what users want.

Ok!

examples/mark_bad_channels.py

mne_bids/commands/mne_bids_mark_bad_channels.py

mne_bids/tests/test_write.py

mne_bids/write.py

agramfort · 2020-09-01T13:56:48Z

mne_bids/write.py

+    # Update info['bads']
+    bads = _get_bads_from_tsv_data(tsv_data)
+    raw.info['bads'] = bads
+    # XXX (How) will this handle split files?


what is your concern?

hoechenberger · 2020-09-01

Oh it's very easy, I've never worked with split files before. So here (actually in the next line, not shown here by github) we're only updating the very first file -- raw.filenames[0]. In my current tests, there is always exactly ONE file in raw.filenames. If we have split files, it could be multiple. I'm not sure if we would have to iterate over all of them, or if only the first one carries a (meaningful) info dict, and updating that one is sufficient.

agramfort · 2020-09-01

For a split file, the sidecar files should be identical. It's basically one file split over different files. The metadata should therefore be exactly the same.
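The body of `_get_bads_from_tsv_data` is not shown in this thread. A plausible stand-in, assuming the TSV data is a dict of column lists, is sketched below, together with plain dicts that stand in for the info of each part of a split recording. Whether MNE needs every part updated separately is not settled here; the sketch only reflects the point that all parts share the same metadata.

```python
def get_bads_from_tsv_data(tsv_data):
    """Return the names of all channels whose status is 'bad'.

    Hypothetical stand-in for the private helper used in the diff.
    """
    return [name
            for name, status in zip(tsv_data["name"], tsv_data["status"])
            if status == "bad"]


tsv_data = {
    "name": ["MEG 0111", "MEG 0112", "EEG 001"],
    "status": ["good", "bad", "bad"],
}
bads = get_bads_from_tsv_data(tsv_data)
print(bads)  # ['MEG 0112', 'EEG 001']

# Stand-ins for the info of each part of a split recording: since the
# metadata is identical across parts, every part gets the same list.
split_infos = [{"bads": []}, {"bads": []}]
for info in split_infos:
    info["bads"] = list(bads)
```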

mne_bids/commands/mne_bids_mark_bad_channels.py

mne_bids/write.py

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

mne_bids/commands/mne_bids_mark_bad_channels.py

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

hoechenberger · 2020-09-01T14:54:32Z

@agramfort Unless #491 (comment) needs more work, this is good to be merged once CI goes green.

agramfort · 2020-09-01T14:55:29Z

let's merge when green and let's iterate

…

agramfort · 2020-09-01T15:26:02Z

Thx @hoechenberger

hoechenberger mentioned this pull request Aug 6, 2020

[MRG] Adding status_description to channels.tsv files defaulting to n/a for now #489

Merged

7 tasks

hoechenberger changed the title ~~WIP: Allow marking channels as bad in existing datasets~~ Allow marking channels as bad in existing datasets Aug 6, 2020

agramfort reviewed Aug 7, 2020

adam2392 mentioned this pull request Aug 7, 2020

Change parse_bids_fname and any other underlying funcs to use explicit BIDS entities #494

Closed

hoechenberger changed the title ~~Allow marking channels as bad in existing datasets~~ [MRG] Allow marking channels as bad in existing datasets Aug 7, 2020

sappelhoff reviewed Aug 8, 2020

mne_bids/write.py (outdated)

mne_bids/write.py (outdated)

agramfort reviewed Aug 8, 2020

mne_bids/commands/mne_bids_mark_bad_channels.py (outdated)

hoechenberger force-pushed the mark-bads branch from 1a16386 to f6bc027 on August 8, 2020 13:57

agramfort approved these changes Aug 8, 2020

hoechenberger added 4 commits August 31, 2020 20:05

Merge branch 'master' of github.com:mne-tools/mne-bids into mark-bads

29e0a3c

Merge branch 'master' of github.com:mne-tools/mne-bids into mark-bads

77fede1

Fully migrate to latest BIDSPath

b6a9a8f

Cleanup

3e24e65

hoechenberger mentioned this pull request Sep 1, 2020

[MRG] BF: Handle extension corrently in BIDSPath.match() #541

Merged

6 tasks

Merge branch 'master' of github.com:mne-tools/mne-bids into mark-bads

7ee78f8

agramfort reviewed Sep 1, 2020

hoechenberger added 2 commits September 1, 2020 16:15

Fix CLI

f9027aa

Update example

e018ce6

hoechenberger commented Sep 1, 2020

mne_bids/commands/mne_bids_mark_bad_channels.py (outdated)

hoechenberger commented Sep 1, 2020

mne_bids/commands/mne_bids_mark_bad_channels.py (outdated)

hoechenberger commented Sep 1, 2020

mne_bids/write.py (outdated)

Apply suggestions from code review

8994f9e

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

agramfort reviewed Sep 1, 2020

mne_bids/commands/mne_bids_mark_bad_channels.py (outdated)

Fix typo

7508e28

agramfort approved these changes Sep 1, 2020

Typo

c0b5f48

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

hoechenberger changed the title ~~Allow marking channels as bad in existing datasets~~ [MRG] Allow marking channels as bad in existing datasets Sep 1, 2020

Remove XXX

159bce8

agramfort merged commit b25ba5e into mne-tools:master Sep 1, 2020

hoechenberger deleted the mark-bads branch September 1, 2020 15:28
