[MAINT] make raw_to_bids more modular #98

jasmainak · 2018-09-22T04:10:04Z

As I complained in many other pull requests about raw_to_bids growing number of arguments, I am raising an issue about it. I think we should break down raw_to_bids into these functions (subject to discussion):

raw = mne_bids.io.read_raw(raw_fname, ...)   # with different arguments for reading filenames
raw_bids_fname = make_bids_filename(subject, session, task, acquisition)
write_raw_bids(raw, raw_bids_fname, events)

What do you think @teonbrooks @sappelhoff @monkeyman192 ?

The text was updated successfully, but these errors were encountered:

monkeyman192 · 2018-09-23T06:16:11Z

will this even work?
There are quite a few different file names that need to be produced in quite a variety of ways, I feel like passing a single file name/path could make the code in the write_raw_bids quite messy.
One solution could be to do something like what I did a little while ago (but scrapped because it wasn't really a good solution) which was to make a class which could generate all the file paths are required.
If we did this again (eg. on that second line), then we could still have clean code and reduce the number of arguments, but it would mean we need to create a class to handle the file path generation.

jasmainak · 2018-09-23T06:20:16Z

will this even work?
There are quite a few different file names that need to be produced in quite a variety of ways, I feel like passing a single file name/path could make the code in the write_raw_bids quite messy.

The way I think of this is that you could consider having a small private function that extracts session, task, acquisition etc. given the filename. The first line of write_raw_bids calls this function and the rest works as usual. Maybe I'm oversimplifying but that's the general idea I'm hinting towards.

One solution could be to do something like what I did a little while ago (but scrapped because it wasn't really a good solution) which was to make a class

you're going to have a tough time trying to convince @teonbrooks for a class :)

jasmainak · 2018-09-23T06:27:33Z

Also, with the current API, we are mixing up two different actions/things: one is a general raw reader and the other is converting + saving raw.info into BIDS.

monkeyman192 · 2018-09-23T06:33:15Z

The way I think of this is that you could consider having a small private function that extracts session, task, acquisition etc. given the filename

so you essentially construct a file name, then would have a function that would pretty much just pull it apart again to construct the actually required file paths. You might as well just pass the parameters grouped together in a dictionary or something and save yourself the effort of having to extract the info.

you're going to have a tough time trying to convince @teonbrooks for a class :)

I wasn't exactly proposing it again, just saying that I thought it may be needed if we wanted that kind of functionality.

jasmainak · 2018-09-23T07:15:15Z

You might as well just pass the parameters grouped together in a dictionary or something and save yourself the effort of having to extract the info.

Yes, but I see this option to be more user friendly. I'm trying to get this closer to the idea of a write/save function where you only need to provide just the filename. It's something people are used to ...

sappelhoff · 2018-09-23T08:06:36Z

I like @monkeyman192 's earlier idea of keeping raw_to_bids as it is and using modality specific dictionaries as input parameters to raw_to_bids.

Example:
eeg_reference and eeg_ground parameters get summarized by eeg_inputs, which is a dict: {'EEGReference': 'Cz', 'EEGGround': 'Fz'}

Like that we can maintain what's already working and address the issue of a growing number of parameters.

jasmainak · 2018-09-23T08:22:00Z

How would you document this? Can you show me an example docstring?

sappelhoff · 2018-09-23T08:31:53Z

def raw_to_bids(subject_id, task, raw_file, output_path, session_id=None,
                run=None, kind='meg', events_data=None, event_id=None,
                meg_inputs=None, eeg_inputs=None, ieeg_inputs=None,
                config=None, overwrite=True, verbose=True):
    """Walk over a folder of files and create BIDS compatible folder.

    Parameters
    ----------
    subject_id : str
        The subject name in BIDS compatible format ('01', '02', etc.)
    task : str
        Name of the task the data is based on.
    raw_file : str | instance of mne.Raw
        The raw data. If a string, it is assumed to be the path to the raw data
        file. Otherwise it must be an instance of mne.Raw
    output_path : str
        The path of the BIDS compatible folder
    session_id : str | None
        The session name in BIDS compatible format.
    run : int | None
        The run number for this dataset.
    kind : str, one of ('meg', 'eeg', 'ieeg')
        The kind of data being converted. Defaults to "meg".
    events_data : str | array | None
        The events file. If a string, a path to the events file. If an array,
        the MNE events array (shape n_events, 3). If None, events will be
        inferred from the stim channel using `mne.find_events`.
    event_id : dict | None
        The event id dict used to create a 'trial_type' column in events.tsv
    meg_inputs : dict | None
        MEG-specific inputs necessary for BIDS formatting, required if kind=='meg'.
        Contains the following keys: hpi, electrode, hsp.
        hpi : None | str | list of str
            Marker points representing the location of the marker coils with
            respect to the MEG Sensors, or path to a marker file.
            If list, all of the markers will be averaged together.
        electrode : None | str
            Digitizer points representing the location of the fiducials and the
            marker coils with respect to the digitized head shape, or path to a
            file containing these points.
        hsp : None | str | array, shape = (n_points, 3)
            Digitizer head shape points, or path to head shape file. If more than
            10`000 points are in the head shape, they are automatically decimated.
    eeg_inputs : dict | None
        EEG-specific inputs necessary for BIDS formatting, required if kind=='meg'.
        Contains the following keys: eeg_ref, eeg_ground
        eeg_ref : str
            Description of the type of reference used and (when applicable) of
            location of the reference electrode. Defaults to None.
        eeg_gnd : str
            Description  of the location of the ground electrode. Defaults to None.
    ieeg_inputs : dict | None
        IEEG-specific inputs necessary for BIDS formatting, required if kind=='meg'.
        Contains the following keys: ieeg_ref, ieeg_ground

....


    config : str | None
        A path to the configuration file to use if the data is from a BTi
        system.
    overwrite : bool
        If the file already exists, whether to overwrite it.
    verbose : bool
        If verbose is True, this will print a snippet of the sidecar files. If
        False, no content will be printed.

In this simple example we already replaced 5 parameters with three ... and these three new parameters have a lot of capacity for more potential inputs

jasmainak · 2018-09-23T08:42:07Z

umm but now you can't have default parameters anymore + your docstring is nested + your code is still as complicated as before ...

jasmainak · 2018-09-23T08:45:25Z

To illustrate the issue of default parameters, if I do:

meg_inputs = dict(hpi=hpi, hsp=hsp)
raw_to_bids(..., meg_inputs, ...)

this won't work with existing code

sappelhoff · 2018-09-23T08:55:33Z

okay fair enough, I kind of agree and I can't even find proper guidance on how to elegantly format such a docstring. It's a sign that this kind of API is not used broadly (and thus possibly a bad idea)

monkeyman192 · 2018-09-27T01:39:04Z

I feel like some of these problems wouldn't exist if the Raw object actually contained the file paths to the objects that were used to construct it (which is a point I have made before).
We have 3 extra arguments for MEG data for example just because we need to specify hpi, hsp and electrode so that this information can be converted to BIDS format correctly. If these paths were stored in the raw object then we wouldn't have a problem.
This would however require the function to operate solely on raw objects (or I guess for .fif/.ds (and probably other) files you could pass just the file, but then it becomes exception based), and would require reworking of stuff in MNE, but It would alleviate some of the problems if the raw data actually contained all the information about what was used to construct it.

Also, if you really wanted to allow for backward compatibility whilst still allowing a new dictionary like meg_inputs you could add a (very dirty) **kwargs to the args to catch anything that isn't otherwise specified.

But I would also raise the point of, well how many people do we think currently use the code, and would it not be better to try and sort out having a more robust way for the function to have arguments now before we release it and it grows more? If we just leave it and then further arguments are needed, if more people are using it it becomes even more annoying to change down the line.

jasmainak · 2018-09-27T02:47:23Z

We can't have kwargs. It's evil :) Matplotlib uses it, and it's very hard to find documentation. Matplotlib still struggles with documentation to this day ...

jasmainak · 2018-10-02T07:37:51Z

This is a blocker for our release. I'll try to flesh out a pull request sometime this week.

teonbrooks · 2018-10-02T15:00:46Z

ok @jasmainak, i will then become Mainak for the review ;) KISS YAGNI!

jasmainak · 2018-10-03T20:12:21Z

yes I need that :) I'm getting old reviewing code and replying to emails ...

teonbrooks · 2018-10-03T20:42:55Z

@jasmainak still waiting for you to close #73 😂

jasmainak · 2018-10-03T21:03:55Z

alright I started it, hope to finish it later today :-)

jasmainak mentioned this issue Sep 22, 2018

Added acquisition parameter to raw_to_bids #93

Merged

sappelhoff added the maint label Sep 23, 2018

jasmainak mentioned this issue Oct 4, 2018

[MRG] Simplify raw_to_bids #106

Merged

teonbrooks mentioned this issue Oct 25, 2018

WIP: mock up of new API #123

Closed

teonbrooks closed this as completed in #106 Nov 5, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MAINT] make raw_to_bids more modular #98

[MAINT] make raw_to_bids more modular #98

jasmainak commented Sep 22, 2018

monkeyman192 commented Sep 23, 2018

jasmainak commented Sep 23, 2018

jasmainak commented Sep 23, 2018 •

edited

Loading

monkeyman192 commented Sep 23, 2018

jasmainak commented Sep 23, 2018

sappelhoff commented Sep 23, 2018

jasmainak commented Sep 23, 2018 •

edited

Loading

sappelhoff commented Sep 23, 2018

jasmainak commented Sep 23, 2018

jasmainak commented Sep 23, 2018

sappelhoff commented Sep 23, 2018

monkeyman192 commented Sep 27, 2018

jasmainak commented Sep 27, 2018

jasmainak commented Oct 2, 2018

teonbrooks commented Oct 2, 2018

jasmainak commented Oct 3, 2018

teonbrooks commented Oct 3, 2018 •

edited

Loading

jasmainak commented Oct 3, 2018

[MAINT] make raw_to_bids more modular #98

[MAINT] make raw_to_bids more modular #98

Comments

jasmainak commented Sep 22, 2018

monkeyman192 commented Sep 23, 2018

jasmainak commented Sep 23, 2018

jasmainak commented Sep 23, 2018 • edited Loading

monkeyman192 commented Sep 23, 2018

jasmainak commented Sep 23, 2018

sappelhoff commented Sep 23, 2018

jasmainak commented Sep 23, 2018 • edited Loading

sappelhoff commented Sep 23, 2018

jasmainak commented Sep 23, 2018

jasmainak commented Sep 23, 2018

sappelhoff commented Sep 23, 2018

monkeyman192 commented Sep 27, 2018

jasmainak commented Sep 27, 2018

jasmainak commented Oct 2, 2018

teonbrooks commented Oct 2, 2018

jasmainak commented Oct 3, 2018

teonbrooks commented Oct 3, 2018 • edited Loading

jasmainak commented Oct 3, 2018

jasmainak commented Sep 23, 2018 •

edited

Loading

jasmainak commented Sep 23, 2018 •

edited

Loading

teonbrooks commented Oct 3, 2018 •

edited

Loading