-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DOC] Better commenting for component selection algorithm in selcomps #91
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much, @handwerkerd, for going through this ! I've pointed out a few minor typos, but this is hugely helpful.
Do you want to leave this open as a [WIP] PR, if you're planning to come back to it ? Or do you want someone else to pick up from here ?
My one general question: there's some information in here that I think we should highlight in the readthedocs site, rather than only in the code base itself. WDYT about including those changes in this PR ?
tedana/selection/select_comps.py
Outdated
are part of the decision tree. Certain comments are prefaced with METRIC | ||
and variable names to make clear which are metrics and others are prefaced | ||
with SELECTION to make clear which are for applying metrics. METRICs tend | ||
to be summary values that containt a signal number per component. These |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "containt" --> "contain"
tedana/selection/select_comps.py
Outdated
""" | ||
|
||
""" | ||
There are sections of this code that calculate metrics that are used in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of moving these metrics into the sphinx documentation, rather than as code comments ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's the eventual goal, but it's a bit tricky when the code is only half-commented. Until there's full documentation of how each metric is or isn't used, making something seem definitive int he sphinx documentation could cause more harm than help. For example, there are some metrics which I thought would be used one way, but they were transformed into another metric & used in more component selection steps than I realized. In that case, saying Metric X is used in selection step Y would be wrong. My hope is, once the full process is better commented in the code, it will be relatively easy to summarize those comments in end-user documentation.
tedana/selection/select_comps.py
Outdated
""" | ||
countsigFS0 = seldict['F_S0_clmaps'].sum(0) | ||
countsigFR2 = seldict['F_R2_clmaps'].sum(0) | ||
countnoise = np.zeros(len(nc)) | ||
|
||
""" | ||
Make table of dice values | ||
METRICS: dice_tbl | ||
dice_FR2, dice_FS0 are calculated for each component and the concatinated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "concatinated" --> "concatenated"
tedana/selection/select_comps.py
Outdated
Vz: The z-scored the natural log of the non-normalized variance explained | ||
of each component | ||
Ktz: The z-scored natural log of the Kappa values | ||
(the '/ 2' doesnot seem necessary beacuse it will be removed by z-scoring) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "doesnot" --> "does not"
tedana/selection/select_comps.py
Outdated
# handwerkerd: This actually seems like a way to avoid using the theoretically | ||
# most liberal threshold only when there was a bad estimate and _mod is | ||
# is more inclusive. My one concern is that it's an odd way to test that | ||
# the _mod elbow is any better. WHy not at least see if _mod < _cons? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "WHy" --> "Why"
cond1 = getelbow_cons(seldict['Kappas']) > KRcut * 2 | ||
cond2 = getelbow_mod(seldict['Kappas'], val=True) < F01 | ||
if cond1 and cond2: | ||
Kcut = getelbow_mod(seldict['Kappas'], val=True) | ||
else: | ||
Kcut = getelbow_cons(seldict['Kappas'], val=True) | ||
# only use inclusive when exclusive is extremely exclusive - half KRcut |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to remove this comment ? It might be helpful to have the original explanation, even if we decide that it doesn't make sense (for us).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to expand the purpose of that comment since it's not clear how inclusive & exclusive are defined. I felt like the following blocks of comments added more detail and replaced this one. If you think the line I deleted contains unique and relevant info, I can add it back in.
https://github.com/handwerkerd/tedana/blob/4c65baf8873ebbc4965820101b006c2cc061c4b7/tedana/selection/select_comps.py#L399-L409
https://github.com/handwerkerd/tedana/blob/4c65baf8873ebbc4965820101b006c2cc061c4b7/tedana/selection/select_comps.py#L415-L419
https://github.com/handwerkerd/tedana/blob/4c65baf8873ebbc4965820101b006c2cc061c4b7/tedana/selection/select_comps.py#L426-L432
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about keeping it with something like "Originally written by PK:" at the start of the comment ? That way, if in finishing commenting this code we decide that it adds something helpful, we can work it back in -- and otherwise delete it. Does that seem reasonable, or do you think it's confusing to have even now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I also removed some comments on Step numbers since it was completely unclear to me why a bunch of stuff was or wasn't considered part of each step. I'll add those back in as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems reasonable to me, at least for the time being (where we're denoting them as "originally written by" or whatever you think best). Thanks again, @handwerkerd !
tedana/selection/select_comps.py
Outdated
rejB = ncl[utils.andb([tt_table[ncl, 0] < 0, | ||
seldict['varex'][ncl] > np.median(seldict['varex']), ncl > KRcut]) == 3] | ||
rej = np.union1d(rej, rejB) | ||
# adjust ncl again to only contain the remainng non-rejected components |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "remainng" --> "remaining"
tedana/selection/select_comps.py
Outdated
ncl = np.setdiff1d(ncl, rej) | ||
|
||
# This is where handwerkerd has paused in hypercommenting the function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣
Also, I (finally) fixed the circle error. Would you be able to merge in the current master branch ? LMK if I can help in that ! |
I'd rather merge before the commenting is complete because there's no harm to adding it now and it would open up this commenting effort for others' contributions. I'd love if others continued this commenting, but I'll try to pick it up again next time I have a block of time to focus on this. I will correct the typos, do one more read-through for clarity, and add some detail to the opening docstring before merging the code. |
I just wanted to jump in to say how unbelievable this is, @handwerkerd ! 🎉 💯 Having tried to work through pieces of |
Codecov Report
@@ Coverage Diff @@
## master #91 +/- ##
=======================================
Coverage 27.95% 27.95%
=======================================
Files 25 25
Lines 1381 1381
=======================================
Hits 386 386
Misses 995 995
Continue to review full report at Codecov.
|
I've added info to the doctoring and done another round of proofreading. If no one sees any additional issues, it should be possible to merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, @handwerkerd ! Merging.
Progress towards #84 .
Changes proposed in this pull request: