Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to recalculate the allele fraction based on AD instead of … #5118

Merged
merged 2 commits into from
Aug 17, 2018

Conversation

ldgauthier
Copy link
Contributor

…the Bayesian estimate

For @meganshand until @davidbenjamin addresses #5096 in a more rigorous way

Copy link
Contributor

@davidbenjamin davidbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments.


/**
* Created by gauthier on 8/15/18.
* Bug Laura to fill this in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to do something about this.

Utils.nonNull(vc, "vc is null");

final GenotypesContext genotypes = vc.getGenotypes();
if ( g == null || !g.isCalled() || g.hasExtendedAttribute(getKeyNames().get(0))) { //don't overwrite AF based on Bayesian estimate if it already exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be clear in the class javadoc that this annotation does nothing if AF is already present.

else if (likelihoods != null) {
DepthPerAlleleBySample adCalc = new DepthPerAlleleBySample();
final int[] AD = adCalc.annotateWithLikelihoods(vc, g, new LinkedHashSet<>(vc.getAlleles()), likelihoods);
final double[] allAlleleFractions = MathUtils.normalizeFromRealSpace(Arrays.stream(AD).mapToDouble(x -> x*1.0).toArray());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x -> x will work here without the *1.0

@@ -235,10 +235,13 @@ private void addGenotypes(final LikelihoodMatrix<Allele> tumorLog10Matrix,
final Optional<LikelihoodMatrix<Allele>> normalLog10Matrix,
final VariantContextBuilder callVcb) {
final double[] tumorAlleleCounts = getEffectiveCounts(tumorLog10Matrix);
final Genotype tumorGenotype = new GenotypeBuilder(tumorSample, tumorLog10Matrix.alleles())
.AD(Arrays.stream(tumorAlleleCounts).mapToInt(x -> (int) FastMath.round(x)).toArray())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is AD still getting added to the genotype in the new code below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the original code wrote out AD based on the Bayesian estimate, then overwrote it during annotation because DepthPerAlleleBySample is a StandardMutectAnnotation. So in theory, there didn't used to be a way to turn it off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that the output without the arg does still have AD and added coverage in the tumor normal test.

@ldgauthier ldgauthier force-pushed the ldg_M2getAFfromAD branch 2 times, most recently from 66c2f20 to d560cb5 Compare August 16, 2018 15:24
@codecov-io
Copy link

codecov-io commented Aug 16, 2018

Codecov Report

Merging #5118 into master will increase coverage by 0.07%.
The diff coverage is 94.118%.

@@              Coverage Diff               @@
##              master     #5118      +/-   ##
==============================================
+ Coverage     86.507%   86.577%   +0.07%     
- Complexity     29260     29716     +456     
==============================================
  Files           1814      1826      +12     
  Lines         135635    137753    +2118     
  Branches       15063     15313     +250     
==============================================
+ Hits          117334    119262    +1928     
- Misses         12833     13019     +186     
- Partials        5468      5472       +4
Impacted Files Coverage Δ Complexity Δ
...ools/walkers/annotator/DepthPerAlleleBySample.java 97.727% <ø> (ø) 21 <0> (ø) ⬇️
...der/tools/walkers/mutect/M2ArgumentCollection.java 94.444% <100%> (+0.327%) 3 <0> (ø) ⬇️
.../tools/walkers/mutect/SomaticGenotypingEngine.java 90.058% <100%> (+0.118%) 70 <2> (+2) ⬆️
...r/tools/walkers/mutect/Mutect2IntegrationTest.java 90.217% <100%> (-0.949%) 72 <5> (+16)
...bender/tools/walkers/annotator/AlleleFraction.java 85.714% <85.714%> (ø) 8 <8> (?)
...ngine/spark/AddContextDataToReadSparkUnitTest.java 92.453% <0%> (-2.142%) 7% <0%> (+1%)
...rkduplicates/MarkDuplicatesSparkUtilsUnitTest.java 95.238% <0%> (-1.314%) 22% <0%> (+11%)
...bender/tools/walkers/qc/PileupIntegrationTest.java 89.189% <0%> (-1.287%) 11% <0%> (+5%)
.../walkers/rnaseq/ASEReadCounterIntegrationTest.java 93.939% <0%> (-0.427%) 27% <0%> (+13%)
... and 50 more

Copy link
Contributor

@davidbenjamin davidbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ldgauthier ldgauthier merged commit 17bd259 into master Aug 17, 2018
@ldgauthier ldgauthier deleted the ldg_M2getAFfromAD branch August 17, 2018 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants