Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify HaplotypeBAMWriter code. #944 #5122

Merged
merged 9 commits into from
Aug 30, 2018
Merged

Simplify HaplotypeBAMWriter code. #944 #5122

merged 9 commits into from
Aug 30, 2018

Conversation

kvinter1
Copy link
Contributor

This is in response to ticket #944

  1. Merged subclass, SAMFileDestination, and superclass, HaplotypeBAMDestination.

  2. Merged both subclasses, AllHaplotypeBAMWriter and CalledHaplotypeBAMWriter, with their superclass, HaplotypeBAMWriter.

@kvinter1 kvinter1 requested a review from cmnbroad August 17, 2018 19:47
@codecov-io
Copy link

codecov-io commented Aug 17, 2018

Codecov Report

Merging #5122 into master will increase coverage by 0.029%.
The diff coverage is 100%.

@@              Coverage Diff               @@
##             master     #5122       +/-   ##
==============================================
+ Coverage     86.66%   86.689%   +0.029%     
- Complexity    29043     31076     +2033     
==============================================
  Files          1808      1835       +27     
  Lines        134662    142271     +7609     
  Branches      14935     16228     +1293     
==============================================
+ Hits         116698    123333     +6635     
- Misses        12552     13373      +821     
- Partials       5412      5565      +153
Impacted Files Coverage Δ Complexity Δ
...ender/utils/haplotype/HaplotypeBAMDestination.java 100% <ø> (ø) 4 <0> (ø) ⬇️
...er/utils/haplotype/HaplotypeBAMWriterUnitTest.java 89.076% <100%> (-0.355%) 23 <0> (-1)
...hellbender/utils/haplotype/HaplotypeBAMWriter.java 98.305% <100%> (+0.228%) 16 <8> (+6) ⬆️
...kers/haplotypecaller/AssemblyBasedCallerUtils.java 70.297% <100%> (ø) 25 <0> (ø) ⬇️
...der/tools/spark/sv/discovery/SvDiscoveryUtils.java 36.082% <0%> (-8.654%) 14% <0%> (+1%)
...rg/broadinstitute/hellbender/utils/Nucleotide.java 95.266% <0%> (-4.734%) 59% <0%> (+52%)
...ats/collections/SimpleCountCollectionUnitTest.java 83.784% <0%> (-2.883%) 4% <0%> (ø)
...ngine/spark/AddContextDataToReadSparkUnitTest.java 92.453% <0%> (-2.142%) 12% <0%> (+6%)
...ils/read/markduplicates/sparkrecords/Fragment.java 90.909% <0%> (-1.948%) 12% <0%> (+6%)
...r/utils/read/markduplicates/sparkrecords/Pair.java 98.276% <0%> (-1.724%) 51% <0%> (+25%)
... and 136 more

final String haplotypeReadGroupID)
{
Utils.nonNull(outPath, "outputPath cannot be null");
Utils.nonNull(sourceHeader, "sourceHeader cannot be null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're checking sourceHeader twice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I will remove the second instance of this.

Assert.assertTrue(HaplotypeBAMWriter.create(HaplotypeBAMWriter.WriterType.CALLED_HAPLOTYPES, writer) instanceof CalledHaplotypeBAMWriter);
Assert.assertTrue(HaplotypeBAMWriter.create(HaplotypeBAMWriter.WriterType.ALL_POSSIBLE_HAPLOTYPES, writer) instanceof AllHaplotypeBAMWriter);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the content of this test get moved to another test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. This test made sure the two subclasses of HaplotypeBAMWriter (AllHaplotypeBAMWriter and CalledHaplotypeBAMWriter) were indeed subclasses of their superclass. Since these were deleted, there is no more need for the test.

Deletion of redundant check of sourceReader being nonNull. (same as Line 45)
Copy link
Contributor

@meganshand meganshand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, looks good to me, but I'd appreciate another set of eyes. Could you take a look at this when you get a chance @cmnbroad?

public void add(final GATKRead read){
Utils.nonNull(read, "read cannot be null");
samWriter.addRead(read);
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some extraneous semicolons at the ends of method bodies (here and below). These are not needed now that the methods are no longer abstract.

@cmnbroad
Copy link
Collaborator

cmnbroad commented Aug 27, 2018

@kvinter1 Back when #944 was originally made, the class hierarchy was overkill since there was only one implementation, but now we have HaplotypeCallerSpark which will need distributed (RDD-based) implementation. So I think we should retain the HaplotypeBAMDestination hierarchy since we'll need it then. Can you revert that part of this (but keep the HaplotypeBAMWriter consolidation part ?). Thanks.

…istence of SAMFileDestination as its subclass. Kept the consolidation of HaplotypeBAMWriter and its subclasses (AllHaplotypeBAMWriter and CalledHaplotypeBAMWriter).
@kvinter1
Copy link
Contributor Author

@cmnbroad Thanks for the comments. I reverted back to the original, abstract HaplotypeBAMDestination and included its subclass SAMFileDestination. All other changes were kept. Let me know if you find any other issues.

Copy link
Collaborator

@cmnbroad cmnbroad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kvinter1 Thanks for reverting those changes. I reviewed the remaining code and just have a couple of cleanup comments. thx.

@@ -33,6 +33,7 @@
private static final int otherMQ = 0;

protected final HaplotypeBAMDestination output;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output can be made private now

@@ -125,25 +103,48 @@ public void close() {

/**
* Write out a BAM representing for the haplotype caller at this site
* writerType (ALL_POSSIBLE_HAPLOTYPES or CALLED_HAPLOTYPES) determines inputs to writeHaplotypesAsReads
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets fix up the javadoc here (the first sentence was incomplete before):

Write out a BAM representing the haplotypes at this site, based on the value for writerType used when the writer was constructed (ALL_POSSIBLE_HAPLOTYPES or CALLED_HAPLOTYPES).

Collection<Haplotype> haplotypesToWrite = writerType.equals(WriterType.CALLED_HAPLOTYPES) ? calledHaplotypes : haplotypes;
Set<Haplotype> bestHaplotypesToWrite = writerType.equals(WriterType.CALLED_HAPLOTYPES) ? calledHaplotypes : new LinkedHashSet<>(bestHaplotypes);

writeHaplotypesAsReads(haplotypesToWrite, bestHaplotypesToWrite, paddedReferenceLoc);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be much clearer to just have an if test after the Utils checks, with one branch for ALL and one for called CALLED.

Utils.nonNull(haplotypes, "haplotypes cannot be null");
Utils.nonNull(paddedReferenceLoc, "paddedReferenceLoc cannot be null");
Utils.nonNull(calledHaplotypes, "calledHaplotypes cannot be null");
Utils.nonNull(readLikelihoods, "readLikelihoods cannot be null");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a null check for bestHaplotypes as well. Also, since we're now requiring calledHaplotypes to always be non-null (I think it used to depend on the writer type), lets update the javadoc for all of these args to say they must be non-null.

Copy link
Collaborator

@cmnbroad cmnbroad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kvinter1!

@cmnbroad cmnbroad merged commit da950e2 into master Aug 30, 2018
@cmnbroad cmnbroad deleted the kpv_#944 branch August 30, 2018 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants