Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the possibility to split multi-run physiological recordings #206

Merged
merged 210 commits into from
Jun 22, 2020

Conversation

sangfrois
Copy link
Member

@sangfrois sangfrois commented Apr 7, 2020

Close #36
These changes are made so that phys2bids workflow can split multi-run recordings into different BlueprintInput objects based on run's start and end indexes (plus or minus a padding given by the user), then deal with each one of them independently - and save them in BIDS format.

Proposed changes

General flake8-isation of docstrings and code.

Adjust the CLI

  • in run.py script : change parser arguments.
    • -ntp becomes a list of number of trigger timepoints expected in each run.
    • -tr becomes a list of TR (maybe there are different sequences).
    • -padding is an optional value that contains the padding to add before and after each run (default to 9s).

Adjust Blueprint objects

  • The outputs of check_trigger_amount in BlueprintInput are now arguments that can be passed to the object, since they have to be overwritten if multi-runs are present.
  • filename, the name of the output file without extension, becomes an attribute of the BlueprintOutput object.

Adjust the main workflow

  • in phys2bids.py, integrate a section dedicated to handle multi-run user inputs.
    • Check tr and ntp list length equivalency and raise appropriate errors. If tr has one element, repeat it for the numbers of runs in -ntp.
    • Check if the sum of expected timepoints equals num_timepoints_found, if not raise an error and stop (otherwise the risk of propagating errors is high).
    • phys_in becomes a dictionary after the workflow checks whether there are multiple runs or not, to better handle the possibility of multiple runs.
    • phys_out keys become based on frequencies and runs (to handle multi-frequency multi-run files).
    • In order to avoid overwriting, the function checks if the name of an output is already in use between different stances of phys_out.

Changes to bids_unit.py and creation of a function that deals with all the bids-related steps

  • Rename bids_unit.py to bids.py. Equally, the tests change names too
  • Move the handle of heuristic and renaming in this new function. The tests for that function change place too!
  • In order to avoid overwriting,

Change to heuristics

  • heuristics now can optionally accept run as an argument to deal with multiple runs.

Adjust viz.py

  • There is now a function that calls plot_trigger (due to the repetition of code in the main workflow) that is called from the main workflow.

Add a utility that finds runs start and end, split the input object into multiple input objects

  • in splice4phys.py script
    • A new function that find beginning and ends of runs find_runs
      • function argument: phys_in (or BlueprintInput object), tr, ntp (as lists), thr and padding (default : 9s)
      • function output: dictionary run_timestamps{run_idx:(start, end), run_idx:...}
    • A new function that split a multi-run input into multiple BlueprintInput objects (inside a dictionary).

Create output objects based on runs starts and ends.

@smoia smoia changed the title [ENHANC] intializing split utility Add workflow to split multi-run physiological recordings Apr 7, 2020
@smoia smoia added this to the The BrainWeb milestone Apr 7, 2020
@smoia smoia added the BrainHack This issue is suggested for BrainHack participants! label Apr 7, 2020
@sangfrois
Copy link
Member Author

sangfrois commented Apr 8, 2020

TO-DO List

  • CLI parser for multi-run recordings
    • Integrate argument parsing from run.py
    • Adapt timepoints and TR args to multi-runs with list type
  • split2phys function
    • integrate arguments needed
    • Code the thing, i.e. Trigger start/end indexes via BlueprintInput()
  • Feed split2phys output dicts to phys2bids

@eurunuela
Copy link
Collaborator

Guys, I feel like this should be another package following #186...

From what I understand, phys2bids converts physio data into BIDS format. split2bids on the other hand, is another tool to have the data ready before running phys2bids.

Idk how you guys feel about it, but to me, that's what makes the most sense.

@smoia
Copy link
Member

smoia commented Apr 8, 2020

Don't worry @eurunuela (and sorry @sangfrois, I promised you I would have taken care of the description here and didn't yet)!

We came up with this solution as the easiest way to integrate a splitting file function during the BrainWeb event. We're still not 100% sure on how to integrate this workflow with phys2bids (should it call phys2bids? should phys2bids call this workflow?), which is why it's a Draft PR here. There will be plenty of time and occasions to discuss about it, first within the BrainWeb team, then with the rest of the developers.

One thing that we were thinking, though, is that while most of the other libraries will be able to work with BIDSified data (as we provide a BIDSifier here), split2phys (name in progress) and phys2bids will be the only two workflows dealing with non-bidsified data. Moreover, most of split2phys is a copy-paste of phys2bids, so probably we could integrate them better in just one package.

@vinferrer
Copy link
Collaborator

that makes sense, let us now when is ready for review

@smoia
Copy link
Member

smoia commented Apr 8, 2020

Sure! Don't worry, it's going to be in the next 1-12 months 😜
If you want to jump in the development and discussion, it's in the mattermost channel and/or on jitsi.

@eurunuela
Copy link
Collaborator

Could you please share the links to the mattermost and jitsi?

I wanted to take part in the BrainWeb but I'm having big issues with my internet connection (my speed was 1 MB/s yesterday, not ideal for a hackathon). So I'm gonna try to follow what's going on in those channels.

Thanks!

@smoia
Copy link
Member

smoia commented Apr 8, 2020

Sure! I think you have everything in the README here

@eurunuela
Copy link
Collaborator

Oh, this is new, cool! Thanks Stefano ;)

@RayStick
Copy link
Member

RayStick commented Apr 9, 2020

In OSF I have put a new long recording that includes 4 scans, that may help with this PR.

four_scans_samefreq.json
four_scans_samefreq_allchannels.txt
four_scans_samefreq_onlytrigger.txt

"_allchannels" exported txt file includes all the physiological channels that were recorded (Trigger, CO2, O2, Pulse) and "_onlytrigger" is as it sounds - only includes the trigger channel in the export.

@smoia
Copy link
Member

smoia commented Jun 18, 2020

Ok. Let me run those tests locally (luckily they should be fast now)!

@smoia
Copy link
Member

smoia commented Jun 22, 2020

So, there's two good news!

Good news is: tests are passing locally! 🎉 🎉 🎉

Other good news: Travis doesn't fail any more due to issues in matplotlib/numpy dependencies. This was happening because (idk why) the python3.7 test was installing numpy 1.15 and matplotlib 3.3.0rc1 - apparently that was a bad combo.
possibly that was raising issues in the 3.6 test that was erroring.

I don't understand why though. By default pip should install the latest, non-rc package (so numpy 1.19 and matplotlib 3.2.2).

Anyhow, I forced the requirements to skip matplotlib 3.3.0rc1 so issues should be solved (at least momentarily). I would like to know if there is any way to skip rc for good!

@eurunuela and @vinferrer or @rmarkello , this is finally, totally ready for reviews. Let's not merge anything else in the meantime.
If anyone can give me one approval, I will merge this in and cut a new release before OHBM.

Copy link
Collaborator

@eurunuela eurunuela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newest changes look good to me! Thank you @smoia @sangfrois for this huge PR. I believe @vinferrer is traveling home today and will not be able to review (probably) until tomorrow.

Copy link
Collaborator

@vinferrer vinferrer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One thing I suppose the way this is implemented, if a file has multiple frequencies, Is the channel frequency the same in all the runs contained in one file? just asking.

@vinferrer
Copy link
Collaborator

I got home at 16:00

@vinferrer
Copy link
Collaborator

Don't forget to open a test issue, we have a drop on coverage

@smoia
Copy link
Member

smoia commented Jun 22, 2020

LGTM. One thing I suppose the way this is implemented, if a file has multiple frequencies, Is the channel frequency the same in all the runs contained in one file? just asking.

If the question is: "is a channel frequency stable across the whole acquisition?", then I would say yes, but it's a good question. AFAIK it's not possible to change sampling rate midway through acquisition, but @sangfrois and @RayStick things might be different in labchart or newer versions of AcqKnowledge.

Don't forget to open a test issue, we have a drop on coverage

Already did! It should be #230 . We also need to add documentation, I'm going to open an issue.

Thank you both for the quick reply! @eurunuela I guess we can merge and celebrate!

@eurunuela
Copy link
Collaborator

Go on @smoia , enjoy the merging! 😉

@smoia smoia merged commit 70e58a7 into physiopy:master Jun 22, 2020
@RayStick
Copy link
Member

RayStick commented Jun 22, 2020

If the question is: "is a channel frequency stable across the whole acquisition?", then I would say yes, but it's a good question. AFAIK it's not possible to change sampling rate midway through acquisition, but @sangfrois and @RayStick things might be different in labchart or newer versions of AcqKnowledge.

I don't think it is possible to change the sampling frequency midway through acquisition with LabChart, or if it is, I am not sure why you would want to. If we wanted to, we'd just stop recording and start a new recording. So I think you can assume that it will be the same sampling frequency across all runs within a single file.

@smoia smoia added the released This issue/pull request has been released. label Oct 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Minormod This PR generally closes an `Enhancement` issue. It increments the minor version (0.+1.0) released This issue/pull request has been released.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chopping files semi-automatically
7 participants