Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #5910: Extract engine from GenotypeGVCFs #6004

Merged
merged 1 commit into from
Aug 5, 2019

Conversation

jonathoncohen98
Copy link
Contributor

Extracted GenotypeGVCFs engine from GenotypeGVCFs

@jonathoncohen98 jonathoncohen98 changed the title Issue #5910 Issue #5910: Extract engine from GenotypeGVCFs Jun 13, 2019
@codecov
Copy link

codecov bot commented Jun 13, 2019

Codecov Report

Merging #6004 into master will decrease coverage by 7.92%.
The diff coverage is 18.142%.

@@              Coverage Diff               @@
##              master     #6004      +/-   ##
==============================================
- Coverage     80.142%   72.222%   -7.92%     
+ Complexity     31040     27019    -4021     
==============================================
  Files           2014      2017       +3     
  Lines         151333    151440     +107     
  Branches       16612     16623      +11     
==============================================
- Hits          121281    109373   -11908     
- Misses         24197     36760   +12563     
+ Partials        5855      5307     -548
Impacted Files Coverage Δ Complexity Δ
...titute/hellbender/tools/walkers/GenotypeGVCFs.java 0% <0%> (-90.871%) 0 <0> (-102)
...ellbender/tools/walkers/GenotypeGVCFsUnitTest.java 100% <100%> (ø) 21 <3> (ø) ⬇️
.../hellbender/tools/walkers/GenotypeGVCFsEngine.java 15.094% <15.094%> (ø) 17 <17> (?)
...nder/tools/copynumber/utils/TagGermlineEvents.java 0% <0%> (-100%) 0% <0%> (-3%)
...r/tools/spark/pathseq/PSBwaArgumentCollection.java 0% <0%> (-100%) 0% <0%> (-1%)
...ender/tools/readersplitters/ReadGroupSplitter.java 0% <0%> (-100%) 0% <0%> (-3%)
...adinstitute/hellbender/engine/ReferenceWalker.java 0% <0%> (-100%) 0% <0%> (-9%)
...ools/funcotator/filtrationRules/ClinVarFilter.java 0% <0%> (-100%) 0% <0%> (-7%)
...ls/walkers/varianteval/stratifications/Sample.java 0% <0%> (-100%) 0% <0%> (-4%)
...nes/metrics/QualityYieldMetricsCollectorSpark.java 0% <0%> (-100%) 0% <0%> (-7%)
... and 475 more

Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I would say that you should clean up the code a little bit so its not full of commented out methods/notes etc. I would also say you should write some javadoc comments for the methods in the new class that don't have them.

@@ -98,8 +93,11 @@
public static final String ALL_SITES_SHORT_NAME = "all-sites";
public static final String KEEP_COMBINED_LONG_NAME = "keep-combined-raw-annotations";
public static final String KEEP_COMBINED_SHORT_NAME = "keep-combined";
private static final String GVCF_BLOCK = "GVCFBlock";
private VCFHeader outputHeader;
//move to engine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to clean up these notes to self

if (somaticInput) {
logger.warn("Note that the Mutect2 reference confidence mode is in BETA -- the likelihoods model and output format are subject to change in subsequent versions.");
}

//leave
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also get rid of the leave comments

import htsjdk.variant.vcf.VCFHeaderLine;
import htsjdk.variant.vcf.VCFStandardHeaderLines;

public class GenotypeGVCFsEngine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add a javadoc comment here explaining what the engine does and generally how it should be used



//constructor
public GenotypeGVCFsEngine(VariantAnnotatorEngine annotationEngine, GenotypeCalculationArgumentCollection genotypeArgs, boolean includeNonVariants)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a propper javadoc explaining ~ what it does.

}

//methods
public VariantContextWriter initialize(VCFHeader inputVCFHeader, Set<VCFHeaderLine> defaultToolVCFHeaderLines, boolean keepCombined, DbsnpArgumentCollection dbsnp, VariantContextWriter vcfWriter) //do work for onTraversalStart
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with this method.

* @param vc the variant context
* @return variant context with the NON-REF alleles removed if multiallelic or replaced with NO-CALL alleles if biallelic
*/
//move to engine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More //move comments that could go

}

/*
//dont move
private static int parseInt(Object attribute){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this from the main class.

intervals = hasUserSuppliedIntervals() ? intervalArgumentCollection.getIntervals(getBestAvailableSequenceDictionary()) :
Collections.emptyList();

final SampleList samples = new IndexedSampleList(inputVCFHeader.getGenotypeSamples()); //todo should this be getSampleNamesInOrder?
//move
//final SampleList samples = new IndexedSampleList(inputVCFHeader.getGenotypeSamples()); //todo should this be getSampleNamesInOrder?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete the commented out code


//setupVCFWriter(inputVCFHeader, samples);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, just delete the code lines you commmented out.

//could leave error checking here, or put error checking in engine - if not null, return VariantContext. if null, return null.

final VariantContext regenotypedVC = gvcfEngine.callRegion(intervals, loc, variants, ref, features, merger, somaticInput, tlodThreshold, afTolerance);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think leaving the below logic in the main tool was probably the right call.

@jonathoncohen98 jonathoncohen98 self-assigned this Jun 14, 2019
Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One quick comment about the javadocs and this should be good to go.

@@ -25,15 +25,24 @@
import java.util.*;
import java.util.stream.Collectors;

/**
* the core engine for the GenotypeGVCF that does all of the actual work of the tool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core engine for the GenotypeGVCF that does all of the actual work of the tool -> 'Engine class to allow for other classes to replicate the behavior of GenotypeGVCFs. See {@link GenotypeGVCFs} for details.

private VariantContextWriter vcfWriter;

/** these are used when {@link #onlyOutputCallsStartingInIntervals) is true */
//pass in to callRegion() in engine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spurious note

@codecov
Copy link

codecov bot commented Jun 17, 2019

Codecov Report

Merging #6004 into master will decrease coverage by 7.92%.
The diff coverage is 18.142%.

@@              Coverage Diff               @@
##              master     #6004      +/-   ##
==============================================
- Coverage     80.142%   72.222%   -7.92%     
+ Complexity     31040     27019    -4021     
==============================================
  Files           2014      2017       +3     
  Lines         151333    151440     +107     
  Branches       16612     16623      +11     
==============================================
- Hits          121281    109373   -11908     
- Misses         24197     36760   +12563     
+ Partials        5855      5307     -548
Impacted Files Coverage Δ Complexity Δ
...titute/hellbender/tools/walkers/GenotypeGVCFs.java 0% <0%> (-90.871%) 0 <0> (-102)
...ellbender/tools/walkers/GenotypeGVCFsUnitTest.java 100% <100%> (ø) 21 <3> (ø) ⬇️
.../hellbender/tools/walkers/GenotypeGVCFsEngine.java 15.094% <15.094%> (ø) 17 <17> (?)
...nder/tools/copynumber/utils/TagGermlineEvents.java 0% <0%> (-100%) 0% <0%> (-3%)
...r/tools/spark/pathseq/PSBwaArgumentCollection.java 0% <0%> (-100%) 0% <0%> (-1%)
...ender/tools/readersplitters/ReadGroupSplitter.java 0% <0%> (-100%) 0% <0%> (-3%)
...adinstitute/hellbender/engine/ReferenceWalker.java 0% <0%> (-100%) 0% <0%> (-9%)
...ools/funcotator/filtrationRules/ClinVarFilter.java 0% <0%> (-100%) 0% <0%> (-7%)
...ls/walkers/varianteval/stratifications/Sample.java 0% <0%> (-100%) 0% <0%> (-4%)
...nes/metrics/QualityYieldMetricsCollectorSpark.java 0% <0%> (-100%) 0% <0%> (-7%)
... and 475 more

@codecov
Copy link

codecov bot commented Jun 17, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@20190cb). Click here to learn what that means.
The diff coverage is 18.142%.

@@            Coverage Diff             @@
##             master     #6004   +/-   ##
==========================================
  Coverage          ?   72.222%           
  Complexity        ?     27019           
==========================================
  Files             ?      2017           
  Lines             ?    151440           
  Branches          ?     16623           
==========================================
  Hits              ?    109373           
  Misses            ?     36760           
  Partials          ?      5307
Impacted Files Coverage Δ Complexity Δ
...titute/hellbender/tools/walkers/GenotypeGVCFs.java 0% <0%> (ø) 0 <0> (?)
...ellbender/tools/walkers/GenotypeGVCFsUnitTest.java 100% <100%> (ø) 21 <3> (?)
.../hellbender/tools/walkers/GenotypeGVCFsEngine.java 15.094% <15.094%> (ø) 17 <17> (?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants