Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add support for experimental design aggregation #724

Draft
wants to merge 38 commits into
base: master
Choose a base branch
from

Conversation

leej3
Copy link
Collaborator

@leej3 leej3 commented May 13, 2021

@effigies, @tyarkoni

Overall this looks like a nice way to support downstream users of the stats-model spec. @Shotgunosine and myself have implemented a simple command line tool that, for a given bids run (I likely mean something more precise like combination of factor levels at the first-level modeling), writes out the appropriate transformed variables (dense and sparse) of the design matrix to tsv files as well as the fully assembled design matrix (currently using the pybids HRF for convolution). In summary, this a CLI that wraps the transformations logic of pybids; pybids will continue to directly access said logic using python to reduce file IO.

$ statsmodels-design-synthesizer -h
usage: statsmodels-design-synthesizer [-h] --events-tsv EVENTS_TSV --transforms TRANSFORMS [--output-sampling-rate OUTPUT_SAMPLING_RATE] --output-dir OUTPUT_DIR --nvol NVOL --tr TR --ta TA

optional arguments:
  -h, --help            show this help message and exit
  --events-tsv EVENTS_TSV
                        Path to events TSV
  --transforms TRANSFORMS
                        Path to transform or model json
  --output-sampling-rate OUTPUT_SAMPLING_RATE
                        Output sampling rate in Hz when a full design matrix is desired.
  --output-dir OUTPUT_DIR
                        Path to directory to write processed event files.

Specify some essential details about the time series.:
  --nvol NVOL           Number of volumes in func time-series
  --tr TR               TR for func time series
  --ta TA               TA for events

Our external prototype (which extracted the time series transformation logic of pybids) revealed some dependence on bids variables/variable collections that makes that option impossible for now (it would result in cyclic import dependencies). If it is reasonable to pull the above classes along with the transformations modules, a hypothetical standalone package could be called bids-timeseries or something like that and provide a way of doing transformations with the various experimental time-series without the bids entities/metadata collection logic of pybids. Such encapsulation could conceivably be valuable. This would be a longer term goal and external contributors would likely be needed to drop the dependencies of such a package though (numpy and pandas currently making the required time-series operations a lot easier than using the python standard library... resampling time series, convolutions etc would be unpleasant to write from scratch). @afni-rickr thoughts/burning desire to contribute said lower level implementation welcome.

Myself and @Shotgunosine are going to make a final push on this today. Any feedback would be much appreciated. Here's the current TODO list.

Feedback would be most valuable here:

  • Decide upon a name. Suggestions here
  • Confirm args for CLI... @effigies felt TR/TA would be best for supporting future functionality. Also passing nvol explicitly rather than parsing it from the image will drop nibabel as a future requirement. Overall stable CLI should be prioritized over parsimony/convenience for interactive use.
  • Provide some example invocations where third party tools are used for convolution with a HRF (might be better implemented as a test within fitlins)
  • Consider adding to_afni/to_fsl methods etc to the bids variable collection class instead of dumping the sparse/variables to tsvs. For now I think the tsvs are a reasonable intermediate for the downstream implementers to target. Sparse/dense tsv combination can be used if package specific convolution with HRFs etc are desired, otherwise the fully constructed design matrix can be mapped to the appropriate downstream format e.g. 3dremlfit compatible design matrix
  • We initially used a named tuple for run info during collection construction. This effectively drives two logical flows through the bids variable loading (in io.py, see here). It seems the RunNode class should be left in pybids in which case we would need to use the collections code path... it would be nice to implemented this in a less hacky way...
  • Implement validation of the transforms json. I read somewhere there was a json schema somewhere already... not sure if the spec has changed for transforms as it was removed from statsmodels though. Alternatively we could implement something with pydantic.

Some other implementation details that we can handle:

  • Confirm/add support for dense/sparse blends pre/post transforms (see here)
  • Add support for parsing a transform json instead of just a model json.
  • Switch to click for CLI
  • Rebase onto statsmodels branch until it is merged

@leej3 leej3 requested review from effigies and tyarkoni May 13, 2021 09:46
leej3 and others added 3 commits May 13, 2021 14:00
transformations were previously not applied unless they were defined in
the default transformations
@leej3 leej3 changed the title Add support for experimental design aggregation [WIP] Add support for experimental design aggregation May 13, 2021
@leej3
Copy link
Collaborator Author

leej3 commented May 13, 2021

We had another think about names for the tool (or pybids subcommand if I can figure our a nice way to add it without becoming too entwined... but we can discuss

make-regressors seems like a nice choice. It can create a design matrix but it also might just output a set of sparse time-series to tsv files. Similarly for input "events" doesn't really capture it all (metadata files and other time-series in derivatives directories will likely be used)

Some other ideas that were thrown out though...

  • events2design
  • events2regressors
  • bids-timeseries
  • bidstime
  • To Inputs for Models from Events (TIME)

@effigies
Copy link
Collaborator

bids morphingtime

@effigies
Copy link
Collaborator

This is to say: Thanks for this! I haven't been able to review it today. I'll have a look tomorrow before the wrap-up.

@leej3
Copy link
Collaborator Author

leej3 commented May 13, 2021

bids morphingtime

Ooh I love this.

I’ll have a look tomorrow

Sounds good. Thanks

@leej3
Copy link
Collaborator Author

leej3 commented May 14, 2021

Is there a draft of a transformations spec somewhere? I suspect my json parsing is a little off...

This is what we will currently support:

@pytest.mark.parametrize(

My biggest concern is that assuming that the first Node of a model json contains the Transformations is likely incorrect.

@effigies
Copy link
Collaborator

Is there a draft of a transformations spec somewhere?

Not yet... Feel free to fork https://docs.google.com/document/d/1F_REhtYTWHBgASV-aoZh9GSV7ZhP0Deg4os4EP7MpM8/edit#heading=h.mqkmyp254xh6 into something. I can work on it this afternoon.

@tyarkoni
Copy link
Collaborator

Here, I dumped just the transformations into a separate document:

https://docs.google.com/document/d/1uxN6vPWbC7ciAx2XWtT5Y-lBrdckZKpPdNUNpwRxHoU/edit?usp=sharing

@tyarkoni
Copy link
Collaborator

Note that not all of the transformations in that document are currently implemented in PyBIDS—feel free to remove them from the doc (or, alternatively, to implement them in PyBIDS).

@leej3
Copy link
Collaborator Author

leej3 commented May 14, 2021

Thanks, I'll have a look through.

Base automatically changed from statsmodels to master May 21, 2021 16:21
Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been sitting on these comments too long. I just haven't finished looking through. Overall what I've seen so far makes sense.

bids/statsmodels_design_synthesizer.py Outdated Show resolved Hide resolved
setup.cfg Outdated
@@ -65,6 +65,7 @@ dev =
[options.entry_points]
console_scripts =
pybids=bids.cli:cli
statsmodels-design-synthesizer=bids.statsmodels_design_synthesizer:main
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All CLIs should be moved under bids.cli. This could be bids.cli.statsmodels_design_synthesizer:main or become a subcommand of pybids.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've moved this to cli now. I think it should be a subcommand. I don't know my way around click well enough to add this quickly.
Moving the cli test from test_morphing_time will require duplicating some of the data setup code, moving said code to a fixture in conftest.py, or mocking the call to morphing_time (tricky because it needs to be mocked on the other side of a system call, run.py could possibly help but might be too complicated).

bids/variables/io.py Outdated Show resolved Hide resolved
Comment on lines +91 to +106
"GroupBy": [
"subject",
"contrast"
]
},
{
"Source": "participant",
"Destination": "by-group",
"GroupBy": [
"sex"
]
},
{
"Source": "participant",
"Destination": "group-diff",
"GroupBy": []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GroupBy should now be in the nodes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Shotgunosine could you look at this one.

john lee and others added 5 commits May 29, 2021 11:39
Correct management of intercept in model file containing convolution..
Make t-test "Test" not type.

Co-authored-by: Chris Markiewicz <markiewicz@stanford.edu>
Ideally morphing-time should be a subcommand.
Moving morphing-time cli test to test_cli requires a little thinking.
@leej3
Copy link
Collaborator Author

leej3 commented May 29, 2021

I've been sitting on these comments too long. I just haven't finished looking through. Overall what I've seen so far makes sense.

no worries. This is more downstream of the spec and flow stuff then I realized. Thanks for taking a look.

@adelavega adelavega marked this pull request as draft September 19, 2022 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants