Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be aware of numerical stability issues introduced by changes to MathUtils methods, e.g. log10factorial. #7649

Closed
samuelklee opened this issue Jan 27, 2022 · 1 comment

Comments

@samuelklee
Copy link
Contributor

samuelklee commented Jan 27, 2022

Not really an issue, just wanted to document some surprising behavior.

@tmelman has been reviving/reimplementing some ancient CNV/ModelSegments evaluations (dating as far back as 4.0.2.1!) and trying to understand whether observed differences---intentional or otherwise---are due to method changes I might have made, or if she might've introduced changes in her reimplementation of the evaluation code.

I ran some checks on the stability of ModelSegments using an old set of inputs (normal/tumor allelic counts and denoised copy ratios for SM-74P4M WES). Behavior has remained largely stable since at least 4.1.0.0. Namely:

  1. We evaluated and signed off on a change that went into 4.1.0.0. See comments in Added MinibatchSliceSampler and replaced naive subsampling in ModelSegments. #5575.
  2. A slight numerical difference in the MCMC-sampled allele fractions was introduced by changes made to some MathUtils code for calculating logs/factorials/etc. between 4.1.0.0 and 4.1.1.0 in Rewrote M2 isActive method as special case of somatic likelihoods model #5814. Note that no CNV code was directly changed, it's just that we call out to that changed MathUtils code---namely, to calculate log10factorial. The overall result in my test was a very slight change to the number of segments found, from 516 to 522.
  3. No further numerical changes have been introduced through the current 4.2.4.1, so any additional code changes I made were indeed true refactors, at least from the perspective of this simple test. Phew!

I was indeed surprised to find that very slight differences in the log10factorial behavior (which result from changing the recursive calculation of cached values to a direct one, and appear in something like the 13th decimal place) led to non-negligible changes in the MCMC estimates of the allele fractions---and thus, changes in the number of segments. Although these are also relatively slight differences in terms of practical impact, they are perhaps much larger than one might guess, given their humble origins.

@samuelklee
Copy link
Contributor Author

samuelklee commented Jan 27, 2022

Actually, I'm going to go ahead and add some exact match tests to guard against this sort of thing. Behavior for key somatic CNV tool modules (i.e., kernel segmentation and MCMC) is unit tested to within statistical noise (so, not exact match) on simulated data, but most integration tests just check for plumbing and not correctness.

The idea was always that this sort of thing would be covered by what eventually became CARROT, since such tests would probably have to be long running and require more resources than are available in the repo to be useful. See the high priority but long dormant issues #4122 and #4123, as well as #4630. In fact, I think the original idea was that Lee's validation would be the first to go into CARROT.

Note also that I did some work to set up transition of all existing CNV tests (also including the somatic CNV validation against TCGA SNP calls that I put together on Terra) before going on leave and moving off CNVs, but during all that, we managed to 1) lose TCGA access, 2) delete the test files on which Lee based his validation after he left, and 3) reassign at least one of the people that was going to help with the transition.

Again, the resulting differences here are minor and it's unlikely that future non-CNV code changes will have similar effects, since the CNV code is relatively well encapsulated, but the exact-match checks will hopefully give us some peace of mind until CARROT tests are ready.

@jonn-smith @KevinCLydon looping you in just in case you're not aware of all of this history. Would love to chat about where CARROT is at and where you'd like it to go---feel free to ping me anytime!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant