
[FIX] Do not use minimum mask for OC data in tedpca #204

Merged 14 commits into ME-ICA:master on Feb 1, 2019

Conversation

@tsalo (Member) commented Jan 27, 2019

Closes #168.

Changes proposed in this pull request:

  • Use the user-supplied mask to mask the optimally combined data in TEDPCA (as is already done for concatenated data [--sourceTEs=0] and for indexed echoes [--sourceTEs=list]), instead of the minimum mask.
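As an illustrative sketch of the change (hypothetical shapes and a simplified stand-in for the minimum mask; see tedana.utils.make_min_mask for the real definition):

```python
import numpy as np

# Hypothetical optimally combined data: (voxels, timepoints).
rng = np.random.default_rng(42)
OCcatd = rng.normal(size=(1000, 50))
OCcatd[:100] = 0  # simulate voxels with no signal

# Simplified stand-in for a data-derived minimum mask: voxels that are
# nonzero at some timepoint (illustrative only; not the tedana code).
min_mask = (OCcatd != 0).any(axis=-1)

# User-supplied mask (e.g., from FSL BET), typically more conservative.
mask = np.zeros(1000, dtype=bool)
mask[100:800] = True

# The fix: mask the optimally combined data with the supplied mask,
# matching the --sourceTEs=0 and --sourceTEs=list code paths.
d = OCcatd[mask, :][:, np.newaxis, :]
print(d.shape)  # (700, 1, 50)
```

The point of the fix is that all three source-data options now honor the same user-supplied mask, rather than the optimally combined path silently deriving its own mask from the data.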

@codecov bot commented Jan 27, 2019

Codecov Report

Merging #204 into master will decrease coverage by 0.02%.
The diff coverage is 0%.


```
@@            Coverage Diff             @@
##           master     #204      +/-   ##
==========================================
- Coverage   51.42%   51.39%   -0.03%
==========================================
  Files          32       32
  Lines        1964     1965       +1
==========================================
  Hits         1010     1010
- Misses        954      955       +1
```

| Impacted Files | Coverage Δ |
| --- | --- |
| tedana/decomposition/eigendecomp.py | 10.46% <0%> (-0.07%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d23934...db78c25.

@codecov bot commented Jan 27, 2019

Codecov Report

Merging #204 into master will not change coverage.
The diff coverage is 0%.


```
@@           Coverage Diff           @@
##           master     #204   +/-   ##
=======================================
  Coverage   51.42%   51.42%
=======================================
  Files          32       32
  Lines        1964     1964
=======================================
  Hits         1010     1010
  Misses        954      954
```

| Impacted Files | Coverage Δ |
| --- | --- |
| tedana/decomposition/eigendecomp.py | 10.52% <0%> (ø) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d23934...f5642fb.

```diff
@@ -266,13 +266,13 @@ def tedpca(catd, OCcatd, combmode, mask, t2s, t2sG,

     if len(ste) == 1 and ste[0] == -1:
         LGR.info('Computing PCA of optimally combined multi-echo data')
-        d = OCcatd[utils.make_min_mask(OCcatd[:, np.newaxis, :])][:, np.newaxis, :]
+        d = OCcatd[mask, :][:, np.newaxis, :]
```
Member (review comment):
One more question: for the two other options, we explicitly cast to float64. Are we sure OCcatd is float64 already ?

Member Author (reply):
I don't see anywhere where the type for the optimally combined data is explicitly set. Honestly, I think casting to float64 is unnecessary for any of the data, but we can do it to the optimally combined data too just to be consistent.

Member (reply):

I'm fine with either path forward, I'm just pro-consistency unless otherwise specified 😅

Member (reply):

Maybe we can open this as another issue, because I'm not sure how much it overlaps here, but I'm actually pretty sure your initial take (that casting to float64 is unnecessary) is correct. I'm bringing it up now because it's an easy chance to change it, and I do think we're increasing memory usage with the re-casting.
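For a rough sense of the memory cost being discussed (hypothetical array shape, not taken from the tedana code):

```python
import numpy as np

# Hypothetical data: ~50k voxels x 200 timepoints at float32.
data = np.zeros((50000, 200), dtype=np.float32)

# astype allocates a brand-new array, so re-casting to float64 briefly
# holds both copies in memory and doubles the steady-state footprint.
as_f64 = data.astype(np.float64)

print(data.nbytes // 2**20, "MiB at float32")    # 38 MiB
print(as_f64.nbytes // 2**20, "MiB at float64")  # 76 MiB
```

For whole-brain multi-echo data the arrays are larger still, which is why skipping an unnecessary float64 cast is a real saving rather than a micro-optimization.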

@emdupre (Member) commented Jan 31, 2019

@jbteves or @dowdlelt: if you run this branch on the data you mentioned in #168 (comment) with --sourceTEs=0, do you get equivalent results? Trying to confirm that dropping float64 casting won't introduce other concerns!

@dowdlelt (Collaborator) commented

@emdupre kicking off a test run now. Hope to have an answer soon.

@jbteves (Collaborator) commented Jan 31, 2019

@dowdlelt, are you running this on our shared server? If so, I don't see a point in replicating efforts; if not, I'll run a test on it so we have at least two OSes covered.

@dowdlelt (Collaborator) commented Feb 1, 2019

Negative, @jbteves; I managed to lock up my local machine because I got greedy with resources. Spin it up on the shared server, and I'll try another run tomorrow AM.

@jbteves (Collaborator) commented Feb 1, 2019

Okay, @emdupre @tsalo here is how I tested:

  1. An existing directory contained pre-fix results generated with --sourceTEs 0 and a user-supplied mask (from FSL BET)
  2. Cloned @tsalo's branch into my local repo, which the environment was pointed at.
  3. Re-ran on the same NifTI files (including original mask)
  4. Reshaped the matrices to single vectors and subtracted

From this I got all zeros in the resulting subtraction, which indicates the results should be the same for the two runs that have already finished (still waiting on several). Let me know if you want additional tests done; I can try to knock them out. If you catch me in the next half hour, I'll run them overnight.
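The subtraction check in steps 3–4 can be sketched with synthetic stand-ins for the two outputs (in practice each array would come from loading the run's output NIfTI, e.g. with nibabel's nb.load(...).get_data()):

```python
import numpy as np

# Synthetic stand-ins for the pre-fix and post-fix dn_ts_OC.nii outputs.
pre = np.random.default_rng(0).normal(size=(16, 16, 10, 20))
post = pre.copy()

# Reshape each 4D image to a single vector and subtract.
diff = pre.ravel() - post.ravel()

print(np.abs(diff).max())  # 0.0 -> the two runs produced identical outputs
```

An exact all-zeros difference is a stronger statement than np.allclose, since it rules out even floating-point-level divergence between the two runs.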

@emdupre (Member) commented Feb 1, 2019

That's reassuring! Just to make sure I understand: which output file was the matrix you reshaped?

Thanks so much for doing that!! ✨

@jbteves (Collaborator) commented Feb 1, 2019

dn_ts_OC.nii

@emdupre (Member) commented Feb 1, 2019

Great! The only other way I can think to test it would be to directly compare the files, with something like

```python
import numpy as np
import nibabel as nb

file1 = nb.load('filename1.nii').get_data()
file2 = nb.load('filename2.nii').get_data()

np.allclose(file1, file2)
```

But that should give you just a True instead of a 0.

Unless @tsalo has any other concerns I'll go ahead and get this merged tomorrow !

@tsalo (Member Author) commented Feb 1, 2019

Sounds good to me. Since this fix has the potential to impact a fair amount of data, I also think we should probably draft a minor release after it's merged, covering this PR and the other things we've managed to do since the last release (e.g., tedort, the tedpca split, changes to the CLI, and vast improvements to the documentation).

@emdupre (Member) commented Feb 1, 2019

OK, I'll go ahead and get this merged ! Thanks to @tsalo for the fix, and to @dowdlelt and @jbteves for the test on real data !

@emdupre emdupre merged commit 741ca71 into ME-ICA:master Feb 1, 2019
@tsalo tsalo deleted the eimask branch February 17, 2019 18:35
Labels: breaking change (Will make a non-trivial change to outputs)

Successfully merging this pull request may close these issues:

  • Error in tedpca with minimally preprocessed data from ds001491

4 participants