Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Allow for correlations with regressors in decision tree #1008

Closed
wants to merge 5 commits into from

Conversation

eurunuela
Copy link
Collaborator

@eurunuela eurunuela commented Nov 24, 2023

Closes #1009.

Changes proposed in this pull request:

  • Adds the dec_correlation_higherthan_thresholds() function to selection_nodes.py, which allows the user to provide a tsv file with regressors in its column to consider during component classification. The function also expects the number of metric labels (e.g., ["visual task", "motor task"]) to match with the number of regressors (i.e., the number of columns in the regressors file).

Copy link

codecov bot commented Nov 24, 2023

Codecov Report

Attention: Patch coverage is 67.44186% with 14 lines in your changes missing coverage. Please review.

Project coverage is 89.25%. Comparing base (1c3f93e) to head (4f286e1).
Report is 63 commits behind head on main.

Files with missing lines Patch % Lines
tedana/selection/selection_nodes.py 63.15% 8 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1008      +/-   ##
==========================================
- Coverage   89.54%   89.25%   -0.29%     
==========================================
  Files          26       26              
  Lines        3395     3434      +39     
  Branches      619      628       +9     
==========================================
+ Hits         3040     3065      +25     
- Misses        207      215       +8     
- Partials      148      154       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@handwerkerd
Copy link
Member

Thank you for starting to work on this and sharing so that it's possible to start discussing.

I do have one concern that I wanted to bring up sooner rather than later. The way you're approaching this now, you're effectively calculating a metric per component as part of a selection process. The advantage of this is that decision tree metrics can be tweaked using just the decision tree code, but the disadvantage is, if we start calculating metrics in the decision tree, then it's harder to understand what's happening using the decision tree outputs. That is, full columns would be added to the component table and nodes in the decision tree both make calculations and decisions. If I'm reading the draft code correctly, your not saving the fit metrics in the component table & that's important info to save.

My preference would be to add this code as a metric and then most of the decisions could be made just using dec_left_op_right. That would make it hard to play around with regressor fit metrics on the same components, but there are ways to address that issue. We can talk more about this later, but wanted to bring this up before you do a lot more work on this.

@handwerkerd
Copy link
Member

Also @goodalse2019 and @n-reddy both mentioned some interest in working on this, so I want to make sure they both saw you've started.

@eurunuela
Copy link
Collaborator Author

That's a fair point @handwerkerd. I actually thought of using dec_left_op_right for this, but ended up building this other approach. It seemed to me the approach on this PR would be the easiest/quickest way to have a working version of it ready.

In any case, happy to discuss how we can adjust the code to use dec_left_op_right.

@tsalo
Copy link
Member

tsalo commented Jan 9, 2024

I like the idea of creating a new metrics/external.py module with functions to correlate the component time series with the external ones. Then the metrics could be evaluated with dec_left_op_right in the decision tree.

@handwerkerd
Copy link
Member

@eurunuela if you think the approach @tsalo and I are working on in #1021 is the one we'll move forward with, do you think this PR should be closed?

@eurunuela
Copy link
Collaborator Author

Absolutely. Let's close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add option to classify components based on their correlation with regressors
3 participants