AlexsLemonade · jaclyn-taroni · Jul 21, 2021 · Jul 19, 2021 · Jul 20, 2021 · Jul 20, 2021
diff --git a/analyses/copy_number_consensus_call/README.md b/analyses/copy_number_consensus_call/README.md
@@ -52,11 +52,12 @@ The per-sample pipeline revolves around the use of Snakemake to run analysis for
 3) Create a `config_snakemake.yaml` that contains all of the samples names to run the Snakemake pipeline
 4) Run the Snakemake pipeline to perform analysis **per sample**. 
 5) Filter for any CNVs that are over a certain **SIZE_CUTOFF** (default 3000 bp)
-6) Filter for any **significant** CNVs called by Freec (default pval = 0.01)
-7) Filter out any CNVs that overlap 50% or more with **Immunoglobulin, telomeric, centromeric, seg_dup regions** as found in the file `ref/cnv_excluded.bed`
-8) Merge any CNVs of the same sample and call method if they **overlap or within 10,000 bp** (We consider CNV calls within 10,000 bp the same CNV)
-9) Reformat the columns of the files (So the info are easier to read)
-10) **Call consensus** by comparing CNVs from 2 call methods at a time. 
+6) Filter for any **significant** CNVs called by Freec (default pval = 0.01) 
+7) Filter to keep manta calls that **PASS** all filters 
+8) Filter out any CNVs that overlap 50% or more with **Immunoglobulin, telomeric, centromeric, seg_dup regions** as found in the file `ref/cnv_excluded.bed`
+9) Merge any CNVs of the same sample and call method if they **overlap or are within 10,000 bp** of each other (we consider CNV calls within 10,000 bp to be the same CNV)
+10) Reformat the columns of the files (so the information is easier to read)
+11) **Call consensus** by comparing CNVs from 2 call methods at a time. 
 
 Since there are 3 callers, there were 3 comparisons: `manta-cnvkit`, `manta-freec`, and `cnvkit-freec`. If a CNV from 1 caller **overlaps 50% or more** with at least 1 CNV from another caller, the common region of the overlapping CNV would be the new CONSENSUS CNV.
 

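The 50% overlap rule described in the README can be sketched as follows. This is a minimal illustration, not the pipeline's actual implementation: the function name `consensus_pairs`, the `(chrom, start, end)` tuple layout, and the choice to measure the overlap fraction against the first caller's CNV length are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical interval layout for illustration: (chrom, start, end), half-open.
Interval = Tuple[str, int, int]


def consensus_pairs(
    calls_a: List[Interval], calls_b: List[Interval], min_frac: float = 0.5
) -> List[Interval]:
    """Return the shared region of every pair of CNVs that overlap enough.

    Assumption: the overlap fraction is measured relative to the length of
    the CNV from the first caller (calls_a).
    """
    consensus = []
    for chrom_a, start_a, end_a in calls_a:
        for chrom_b, start_b, end_b in calls_b:
            if chrom_a != chrom_b:
                continue
            # The common region of two intervals on the same chromosome.
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            overlap = end - start
            if overlap <= 0:
                continue  # no shared bases
            if overlap / (end_a - start_a) >= min_frac:
                consensus.append((chrom_a, start, end))
    return consensus


manta = [("chr1", 1000, 5000)]
cnvkit = [("chr1", 3000, 9000), ("chr1", 4500, 6000), ("chr2", 1000, 5000)]
result = consensus_pairs(manta, cnvkit)
```

Running the same function over each of the three caller pairs (`manta-cnvkit`, `manta-freec`, `cnvkit-freec`) would mirror the pairwise comparison the README describes.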
diff --git a/analyses/copy_number_consensus_call/Snakefile b/analyses/copy_number_consensus_call/Snakefile
@@ -98,10 +98,10 @@ rule manta_filter:
         ## the first awk also filters out for CNV length
         ## The sort command sorts the first digit of chromosome number numerically
         ## The last pipe is to introduce tab into the file and output file name.        
-        """awk '$6~/DEL/ {{if ($5 > {params.SIZE_CUTOFF}) {{print "chr"$2,$3,$4,$5,"NA","NA","NA",$6}}}}' {input} """
+        """awk '$6~/DEL/ {{if ($5 > {params.SIZE_CUTOFF} && $11 == "PASS") {{print "chr"$2,$3,$4,$5,"NA","NA","NA",$6}}}}' {input} """
         """ | sort -k1,1 -k2,2n """
         """ | tr [:blank:] '\t' > {output.manta_del} && """
-        """awk '$6~/DUP/ {{if ($5 > {params.SIZE_CUTOFF}) {{print "chr"$2,$3,$4,$5,"NA","NA","NA",$6}}}}' {input} """
+        """awk '$6~/DUP/ {{if ($5 > {params.SIZE_CUTOFF} && $11 == "PASS") {{print "chr"$2,$3,$4,$5,"NA","NA","NA",$6}}}}' {input} """
         """ | sort -k1,1 -k2,2n """
         """ | tr [:blank:] '\t' > {output.manta_dup}"""
 
@@ -330,4 +330,4 @@ rule make_segfile:
         " -i {input.consensus}"
         " -n {input.neutral}"
         " -u {input.uncalled}"
-        " -o {output}"
+        " -o {output}"
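
For reviewers who want to check the intent of the `manta_filter` rule without running awk, here is a rough Python sketch of the same filter. It takes the field positions from the awk command (fields 2 to 6 and 11), but the function name `filter_manta`, the handling of header lines, and the sample rows are assumptions for illustration only. It is not a drop-in replacement for the rule.

```python
from typing import Iterable, List


def filter_manta(
    lines: Iterable[str], sv_type: str, size_cutoff: float = 3000
) -> List[str]:
    """Keep calls of one type (e.g. "DEL") that exceed the size cutoff and PASS."""
    rows = []
    for line in lines:
        fields = line.split()  # whitespace splitting, like awk's default
        if len(fields) < 11:
            continue
        try:
            size = float(fields[4])
            start, end = int(fields[2]), int(fields[3])
        except ValueError:
            continue  # assumption: non-numeric rows (e.g. a header) are skipped
        if sv_type in fields[5] and size > size_cutoff and fields[10] == "PASS":
            rows.append(
                ("chr" + fields[1], start, end, fields[4], "NA", "NA", "NA", fields[5])
            )
    # Sort by chromosome label, then numerically by start position.
    rows.sort(key=lambda row: (row[0], row[1]))
    return ["\t".join(str(value) for value in row) for row in rows]


example = [
    "s1 1 100 5000 4900 DEL x x x x PASS",
    "s1 1 100 5000 4900 DEL x x x x MinQUAL",
    "s1 2 50 100 50 DEL x x x x PASS",
    "s1 1 10 9000 8990 DUP x x x x PASS",
]
deletions = filter_manta(example, "DEL")
```

Only the first row survives the `DEL` filter: the second fails the `PASS` check and the third is below the size cutoff.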