Add the possibility to split multi-run physiological recordings #206

sangfrois · 2020-04-07T21:16:34Z

Close #36
These changes let the phys2bids workflow split multi-run recordings into separate BlueprintInput objects, based on each run's start and end indexes (plus or minus a padding given by the user), then process each of them independently and save them in BIDS format.

Proposed changes

General flake8-isation of docstrings and code.

Adjust the CLI

In the run.py script, change the parser arguments:
- -ntp becomes a list of the number of trigger timepoints expected in each run.
- -tr becomes a list of TRs (runs may use different sequences).
- -padding is an optional value that sets the padding, in seconds, to add before and after each run (default: 9 s).
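
As a rough illustration of the CLI change above, here is a minimal argparse sketch. It is not the actual content of run.py: the function name `build_parser`, the `dest` names, and the help strings are assumptions made for this example; only the flag names and the 9 s default come from the description.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the three arguments described above."""
    parser = argparse.ArgumentParser(prog="phys2bids")
    # One expected trigger count per run, e.g. "-ntp 158 158 200".
    parser.add_argument("-ntp", dest="ntp", type=int, nargs="*", default=[0],
                        help="Number of expected trigger timepoints, one per run.")
    # One TR per run; a single value is meant to be reused for every run.
    parser.add_argument("-tr", dest="tr", type=float, nargs="*", default=[1.0],
                        help="TR of each run, in seconds.")
    # Padding added before and after each run, in seconds (default from the PR text).
    parser.add_argument("-padding", dest="padding", type=float, default=9.0,
                        help="Padding in seconds around each run.")
    return parser
```

Calling `build_parser().parse_args(["-ntp", "158", "200", "-tr", "1.2"])` would give `ntp` and `tr` as lists and leave `padding` at its default.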

Adjust Blueprint objects

The outputs of check_trigger_amount in BlueprintInput are now arguments that can be passed to the object, since they have to be overwritten when multiple runs are present.
filename, the name of the output file without extension, becomes an attribute of the BlueprintOutput object.

Adjust the main workflow

In phys2bids.py, add a section dedicated to handling multi-run user inputs:
- Check that the tr and ntp lists have the same length and raise appropriate errors otherwise. If tr has one element, repeat it for the number of runs in -ntp.
- Check that the sum of the expected timepoints equals num_timepoints_found; if not, raise an error and stop (otherwise the risk of propagating errors is high).
- phys_in becomes a dictionary after the workflow checks whether there are multiple runs or not, to better handle the possibility of multiple runs.
- phys_out keys are now based on frequencies and runs (to handle multi-frequency, multi-run files).
- To avoid overwriting, the function checks whether the name of an output is already in use across different instances of phys_out.
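
The first two checks in the list above could look roughly like the sketch below. The function name `check_multirun_inputs` and the error messages are hypothetical, chosen for illustration; the actual code in phys2bids.py may be organised differently.

```python
def check_multirun_inputs(ntp, tr, num_timepoints_found):
    """Validate per-run ntp and tr lists (hypothetical helper, not the real API).

    Returns the ntp list and a tr list with one entry per run.
    """
    ntp = list(ntp)
    tr = list(tr)
    # A single TR is assumed to apply to every run listed in ntp.
    if len(tr) == 1 and len(ntp) > 1:
        tr = tr * len(ntp)
    if len(tr) != len(ntp):
        raise ValueError(
            f"Got {len(ntp)} run(s) in ntp but {len(tr)} value(s) in tr."
        )
    # Stop early if the expected triggers do not match what was found,
    # since every later run boundary would be shifted.
    if sum(ntp) != num_timepoints_found:
        raise ValueError(
            f"Expected {sum(ntp)} timepoints in total, "
            f"found {num_timepoints_found}."
        )
    return ntp, tr
```

Raising before any splitting happens keeps a wrong -ntp value from silently producing misaligned runs.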

Changes to `bids_unit.py` and creation of a function that deals with all the bids-related steps

Rename bids_unit.py to bids.py. The tests are renamed accordingly.
Move the handling of the heuristic and the renaming into this new function. The tests for that function move too!
In order to avoid overwriting,

Change to heuristics

Heuristics can now optionally accept run as an argument to deal with multiple runs.
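
To give an idea of what a heuristic with an optional run argument might look like, here is a small sketch. The matching pattern, the dictionary keys, and the zero-padded run label are assumptions for illustration only, not the heuristic shipped with phys2bids.

```python
import fnmatch


def heur(physinfo, run=""):
    """Hypothetical heuristic: map a file name (and optional run) to BIDS entities."""
    info = {}
    # Assumed example rule: any file name containing "rest" is a rest task.
    if fnmatch.fnmatchcase(physinfo, "*rest*"):
        info["task"] = "rest"
    # Only add a run entity when the workflow passes one in.
    if run not in ("", None):
        info["run"] = f"{int(run):02d}"
    return info
```

With this sketch, single-run calls can keep omitting `run`, while the multi-run workflow passes the run number for each split object.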

Adjust `viz.py`

The main workflow now calls a new function that wraps plot_trigger, replacing code that was repeated in the main workflow.

Add a utility that finds run starts and ends and splits the input object into multiple input objects

In the splice4phys.py script:
- A new function, find_runs, that finds the beginning and end of each run.
  - function arguments: phys_in (or BlueprintInput object), tr and ntp (as lists), thr and padding (default: 9 s)
  - function output: dictionary run_timestamps {run_idx: (start, end), run_idx: ...}
- A new function that splits a multi-run input into multiple BlueprintInput objects (inside a dictionary).
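
The run-boundary logic described above can be sketched as follows. This is a simplified stand-in, not find_runs itself: it takes a plain list of trigger sample indexes instead of a BlueprintInput object, and the name `find_run_bounds`, the `freq` and `n_samples` parameters, the 1-based run indexes, and the "one TR after the last trigger" end rule are all assumptions.

```python
def find_run_bounds(trigger_samples, ntp_list, tr_list, freq, n_samples, padding=9.0):
    """Return {run_idx: (start, end)} in samples (hypothetical sketch).

    trigger_samples: sample index of every detected trigger, in order.
    freq: sampling frequency in Hz; padding: seconds added on each side.
    """
    pad = int(round(padding * freq))
    bounds = {}
    first = 0  # position in trigger_samples of the current run's first trigger
    for run_idx, (ntp, tr) in enumerate(zip(ntp_list, tr_list), start=1):
        run_triggers = trigger_samples[first:first + ntp]
        if len(run_triggers) < ntp:
            raise ValueError(f"Run {run_idx}: expected {ntp} triggers, "
                             f"found {len(run_triggers)}.")
        # Assumed rule: a run ends one TR after its last trigger.
        run_end = run_triggers[-1] + int(round(tr * freq))
        # Pad both sides, but never go outside the recording.
        start = max(run_triggers[0] - pad, 0)
        end = min(run_end + pad, n_samples)
        bounds[run_idx] = (start, end)
        first += ntp
    return bounds
```

Each (start, end) pair could then be used to slice the timeseries of the original object into one new input object per run.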

Create output objects based on run starts and ends.

sangfrois · 2020-04-08T02:06:36Z

TO-DO List

CLI parser for multi-run recordings
- Integrate argument parsing from run.py
- Adapt timepoints and TR args to multi-runs with list type
split2phys function
- integrate arguments needed
- Code the thing, i.e. Trigger start/end indexes via BlueprintInput()
Feed split2phys output dicts to phys2bids

eurunuela · 2020-04-08T07:08:06Z

Guys, I feel like this should be another package following #186...

From what I understand, phys2bids converts physio data into BIDS format. split2bids on the other hand, is another tool to have the data ready before running phys2bids.

Idk how you guys feel about it, but to me, that's what makes the most sense.

smoia · 2020-04-08T09:56:29Z

Don't worry @eurunuela (and sorry @sangfrois, I promised you I would take care of the description here and haven't yet)!

We came up with this solution as the easiest way to integrate a file-splitting function during the BrainWeb event. We're still not 100% sure how to integrate this workflow with phys2bids (should it call phys2bids? should phys2bids call this workflow?), which is why it's a Draft PR here. There will be plenty of time and occasions to discuss it, first within the BrainWeb team, then with the rest of the developers.

One thing that we were thinking, though, is that while most of the other libraries will be able to work with BIDSified data (as we provide a BIDSifier here), split2phys (name in progress) and phys2bids will be the only two workflows dealing with non-bidsified data. Moreover, most of split2phys is a copy-paste of phys2bids, so probably we could integrate them better in just one package.

vinferrer · 2020-04-08T10:03:20Z

That makes sense, let us know when it is ready for review.

smoia · 2020-04-08T10:04:48Z

Sure! Don't worry, it's going to be in the next 1-12 months 😜
If you want to jump in the development and discussion, it's in the mattermost channel and/or on jitsi.

eurunuela · 2020-04-08T10:13:45Z

Could you please share the links to the mattermost and jitsi?

I wanted to take part in the BrainWeb but I'm having big issues with my internet connection (my speed was 1 MB/s yesterday, not ideal for a hackathon). So I'm gonna try to follow what's going on in those channels.

Thanks!

smoia · 2020-04-08T10:24:31Z

Sure! I think you have everything in the README here

eurunuela · 2020-04-08T10:28:14Z

Oh, this is new, cool! Thanks Stefano ;)


RayStick · 2020-04-09T18:27:35Z

In OSF I have put a new long recording that includes 4 scans, that may help with this PR.

four_scans_samefreq.json
four_scans_samefreq_allchannels.txt
four_scans_samefreq_onlytrigger.txt

"_allchannels" exported txt file includes all the physiological channels that were recorded (Trigger, CO2, O2, Pulse) and "_onlytrigger" is as it sounds - only includes the trigger channel in the export.

smoia · 2020-06-18T18:17:03Z

Ok. Let me run those tests locally (luckily they should be fast now)!

smoia · 2020-06-22T00:31:48Z

So, there are two pieces of good news!

Good news is: tests are passing locally! 🎉 🎉 🎉

Other good news: Travis doesn't fail any more due to issues in matplotlib/numpy dependencies. This was happening because (idk why) the python3.7 test was installing numpy 1.15 and matplotlib 3.3.0rc1 - apparently that was a bad combo.
Possibly that was also raising the issues in the 3.6 test that was erroring.

I don't understand why though. By default pip should install the latest, non-rc package (so numpy 1.19 and matplotlib 3.2.2).

Anyhow, I forced the requirements to skip matplotlib 3.3.0rc1 so issues should be solved (at least momentarily). I would like to know if there is any way to skip rc for good!

@eurunuela and @vinferrer or @rmarkello , this is finally, totally ready for reviews. Let's not merge anything else in the meantime.
If anyone can give me one approval, I will merge this in and cut a new release before OHBM.

eurunuela

Newest changes look good to me! Thank you @smoia @sangfrois for this huge PR. I believe @vinferrer is traveling home today and will not be able to review (probably) until tomorrow.


vinferrer

LGTM. One question about the way this is implemented: if a file has multiple frequencies, is the channel frequency the same in all the runs contained in one file? Just asking.

vinferrer · 2020-06-22T15:40:35Z

I got home at 16:00

vinferrer · 2020-06-22T15:41:37Z

Don't forget to open a test issue, we have a drop on coverage

smoia · 2020-06-22T15:51:22Z

> LGTM. One question about the way this is implemented: if a file has multiple frequencies, is the channel frequency the same in all the runs contained in one file? Just asking.

If the question is: "is a channel frequency stable across the whole acquisition?", then I would say yes, but it's a good question. AFAIK it's not possible to change sampling rate midway through acquisition, but @sangfrois and @RayStick, things might be different in LabChart or newer versions of AcqKnowledge.

> Don't forget to open a test issue, we have a drop on coverage

Already did! It should be #230 . We also need to add documentation, I'm going to open an issue.

Thank you both for the quick reply! @eurunuela I guess we can merge and celebrate!

eurunuela · 2020-06-22T16:07:36Z

Go on @smoia , enjoy the merging! 😉

RayStick · 2020-06-22T21:31:26Z

> If the question is: "is a channel frequency stable across the whole acquisition?", then I would say yes, but it's a good question. AFAIK it's not possible to change sampling rate midway through acquisition, but @sangfrois and @RayStick, things might be different in LabChart or newer versions of AcqKnowledge.

I don't think it is possible to change the sampling frequency midway through acquisition with LabChart, or if it is, I am not sure why you would want to. If we wanted to, we'd just stop recording and start a new recording. So I think you can assume that it will be the same sampling frequency across all runs within a single file.

[ENHANC] intializing split utility

88f12a1

smoia changed the title ~~[ENHANC] intializing split utility~~ Add workflow to split multi-run physiological recordings Apr 7, 2020

[ENHANC] adding elements of the plan

ee685c2

smoia assigned smoia and sangfrois Apr 7, 2020

smoia added this to the The BrainWeb milestone Apr 7, 2020

smoia added the BrainHack label (This issue is suggested for BrainHack participants!) Apr 7, 2020

ephemeral issue_36.md added in root for planif. Linter Revisions

3215022

sangfrois and others added 6 commits April 8, 2020 15:06

updating step two, specified ways2integrate PR and possible names

e494d3e

fixed typos and syntax error

cd0c0a9

Merge branch 'master' of https://github.com/physiopy/phys2bids into s…

1d3c45b

…plit_utility

fixed typos and updated 2nd part of plan

7759012

Add trig_idx as BlueprintInput attribute

1168258

General reordering, comments and corrections

e1b7e37

sangfrois force-pushed the split_utility branch from fbdab1c to 7759012 Compare April 9, 2020 04:03

sangfrois and others added 6 commits April 9, 2020 00:04

Merge pull request #2 from smoia/enh/split4phys

0ff1b24

General reordering, comments and corrections

Finish adding trig_idx attribute

5610ec9

implementing comments from stefano - corrections

232dea9

trimming and detailed comments

ba8b233

trimming and detailed comments - additions

de4e8f6

lintering

80c43f2

smoia force-pushed the split_utility branch from 07371b5 to 68a77da Compare June 18, 2020 18:05

Fix paths

ec3bc0d

smoia force-pushed the split_utility branch from 162bd03 to ec3bc0d Compare June 19, 2020 09:13

Stefano Moia added 15 commits June 22, 2020 00:23

Merge branch 'master' into enh/split4phys

dff0b14

Adapt use_heuristic call to new heuristic call, remove unused library

c62a1be

Add "run" as argument in new heuristic

5da8f09

Format filename frequency as integer

797401f

Format frequency as integer for heuristic

4595150

Optimise run_amount

747e9bb

Optimise metadata creation

30e3f14

Fix paths

c379fc7

Fix log name

c59c40b

Rename multifreq files as {freq}Hz

5aedebb

Adapt names

4d2d15d

Lint

aaf8624

Force pip to skip release candidates

bf943eb

Force pip to skip release candidates

4694d4d

Force pip to skip matplotlib 3.3.0rc1

2d56fd3

eurunuela approved these changes Jun 22, 2020


vinferrer approved these changes Jun 22, 2020


smoia merged commit 70e58a7 into physiopy:master Jun 22, 2020

smoia added the released label (This issue/pull request has been released.) Oct 14, 2020
