Reciprocal and kinase #821

kgaonkar6 · 2020-10-21T15:42:34Z

Purpose/implementation Section

What scientific question is your analysis addressing?

Add kinase domain retention status for fused genes, since this information will be needed to filter BRAF and other kinase gene fusions that we use for LGAT subtyping. We also want to check whether a fusion is reciprocal, that is, whether the fusion callers called both GeneX--GeneY and GeneY--GeneX.

What was your approach?

First, I added the LeftBreakpoint and RightBreakpoint columns, since we need this information to annotate domain retention. (Earlier, we had removed these columns so that 1 unique fusion row per Sample could be retained.)

Then, we use the fusion_driver function from annoFuse to add kinase domain status for Gene1A (5' gene) and Gene1B (3' gene) in the columns DomainRetainedGene1A and DomainRetainedGene1B. For each kinase gene, the domain retention annotation is as follows:

Domain retention annotation	Description
DomainRetainedGene1A == Yes	LeftBreakpoint downstream of domain end in any fusion
DomainRetainedGene1A == Partial	LeftBreakpoint within domain boundaries in any fusion
DomainRetainedGene1A == No	LeftBreakpoint upstream of domain start in any fusion
DomainRetainedGene1B == Yes	RightBreakpoint upstream of domain start in in-frame fusion
DomainRetainedGene1B == Partial	RightBreakpoint within domain boundaries in in-frame fusion
DomainRetainedGene1B == No	RightBreakpoint downstream of domain end in any fusion
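
The rules in the table can be sketched as a small classifier. This is a minimal illustration in Python rather than the module's R, and it is not the annoFuse implementation: the function name and arguments are hypothetical, positions are assumed to increase in the direction of transcription (strand handling is omitted), and the in-frame condition listed for Gene1B is not modeled.

```python
def domain_retained(breakpoint, domain_start, domain_end, partner):
    """Classify kinase domain retention for one breakpoint.

    Assumptions (not taken from annoFuse): positions increase in the
    direction of transcription and domain_start <= domain_end.
    partner is "Gene1A" (5' gene) or "Gene1B" (3' gene).
    """
    if domain_start <= breakpoint <= domain_end:
        # Breakpoint falls inside the domain boundaries.
        return "Partial"
    if partner == "Gene1A":
        # The 5' gene contributes the sequence upstream of its breakpoint,
        # so the domain is kept only if the breakpoint lies after its end.
        return "Yes" if breakpoint > domain_end else "No"
    if partner == "Gene1B":
        # The 3' gene contributes the sequence downstream of its breakpoint,
        # so the domain is kept only if the breakpoint lies before its start.
        return "Yes" if breakpoint < domain_start else "No"
    raise ValueError(f"unknown partner: {partner}")
```

For a hypothetical domain spanning positions 100 to 300, `domain_retained(500, 100, 300, "Gene1A")` gives `"Yes"`, while the same breakpoint for `"Gene1B"` gives `"No"`.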

Within the function, the underlying pfam domain annotation function annotates the retention status of each domain per breakpoint, using domain ID and location information from:

Annotation	File	Source
pfamID	http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/pfamDesc.txt.gz	UCSC pfamID Description database
Domain Location	http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/ucscGenePfam.txt.gz	UCSC pfamID Description database

For reciprocal status, I've added a function that records this information as logical values in a separate column, reciprocal_exists. For example, in Sample BS_044XZ8ST we have the reciprocal fusions ANTXR1--BRAF and BRAF--ANTXR1, so these fusions will have reciprocal_exists == TRUE.
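
The idea behind reciprocal_exists can be sketched as follows. This is a minimal illustration in Python rather than the R function added in this PR; the function name, the dict-based rows, and the second sample ID are hypothetical, and treating a gene fused with itself as non-reciprocal is an assumption of the sketch.

```python
def add_reciprocal_exists(rows):
    """Return copies of rows with a boolean reciprocal_exists field.

    Each row is a dict with "Sample" and "FusionName" keys. Only simple
    GeneX--GeneY names are checked; names containing "/" (intergenic
    fusions) are never marked as reciprocal.
    """
    seen = {(row["Sample"], row["FusionName"]) for row in rows}
    annotated = []
    for row in rows:
        genes = row["FusionName"].split("--")
        is_simple = len(genes) == 2 and "/" not in row["FusionName"]
        reciprocal = False
        # Assumption: GeneX--GeneX is not treated as its own reciprocal.
        if is_simple and genes[0] != genes[1]:
            reversed_name = f"{genes[1]}--{genes[0]}"
            # The reversed name must be called in the same Sample.
            reciprocal = (row["Sample"], reversed_name) in seen
        annotated.append({**row, "reciprocal_exists": reciprocal})
    return annotated


rows = [
    {"Sample": "BS_044XZ8ST", "FusionName": "ANTXR1--BRAF"},
    {"Sample": "BS_044XZ8ST", "FusionName": "BRAF--ANTXR1"},
    # Hypothetical sample ID, used only to show a non-reciprocal row.
    {"Sample": "SAMPLE_2", "FusionName": "KIAA1549--BRAF"},
]
for row in add_reciprocal_exists(rows):
    print(row["Sample"], row["FusionName"], row["reciprocal_exists"])
```

Building the set of (Sample, FusionName) pairs once keeps the lookup per row cheap, and restricting the match to the same Sample follows the description above.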

What GitHub issue does your pull request address?

#812

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

I've tried to organize the chunks in 04-project-specific-filtering.Rmd so that there are minimal code changes. Please let me know if it is easy enough to follow.

Is there anything that you want to discuss further?

Since I've now added the LeftBreakpoint and RightBreakpoint columns to pbta-fusion-putative-oncogenic.tsv, there can be multiple rows per FusionName and Sample if a fusion has multiple breakpoints. This doesn't affect *recurrent-fusion-byhistology.tsv, *recurrent-fused-genes-byhistology.tsv, *recurrent-fusion-bysample.tsv, or *recurrent-fused-genes-bysample.tsv, but it might affect other modules that don't take unique FusionName and Sample combinations.
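
For downstream modules that need one row per fusion and sample, the deduplication could look like this. It is a minimal sketch in Python rather than the R used in the analysis modules; the function name is hypothetical and the breakpoint values are placeholders, not real coordinates.

```python
def unique_fusions(rows, keys=("Sample", "FusionName")):
    """Keep the first occurrence of each key combination.

    Only the key columns are returned, so the breakpoint columns that
    cause the duplicate rows are dropped.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        if key not in seen:
            seen.add(key)
            unique.append({k: row[k] for k in keys})
    return unique


rows = [
    # Placeholder breakpoints: the same fusion called at two positions.
    {"Sample": "BS_044XZ8ST", "FusionName": "ANTXR1--BRAF",
     "LeftBreakpoint": "chrN:1", "RightBreakpoint": "chrN:10"},
    {"Sample": "BS_044XZ8ST", "FusionName": "ANTXR1--BRAF",
     "LeftBreakpoint": "chrN:2", "RightBreakpoint": "chrN:10"},
]
print(unique_fusions(rows))
```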

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

table

What is your summary of the results?

Kinase domain retention information is added per kinase gene fusion

Reproducibility Checklist

The dependencies required to run the code in this pull request have been added to the project Dockerfile.
This analysis has been added to continuous integration.

Documentation Checklist

This analysis module has a README and it is up to date.
This analysis is recorded in the table in analyses/README.md and the entry is up to date.
The analytical code is documented and contains comments.

jaclyn-taroni · 2020-10-26T23:18:49Z

Are the changes in analyses/fusion_filtering/results/pbta-fusion-recurrent-fusion-bysample.tsv and analyses/fusion_filtering/results/pbta-fusion-recurrently-fused-genes-bysample.tsv expected or unexpected?

jaclyn-taroni

This looks good! I had a few questions before I approve.

analyses/fusion_filtering/04-project-specific-filtering.Rmd

analyses/fusion_filtering/README.md

jaclyn-taroni · 2020-10-26T23:30:12Z

analyses/fusion_filtering/04-project-specific-filtering.Rmd

+# check for fusions have reciprocal fusions in the same Sample
+# works only for GeneY -- GeneX ; GeneX -- GeneY matches
+recirpocal_fusion <- function(FusionName,Sample,standardFusioncalls ){
+  Gene1A <- strsplit(FusionName,"--")[[1]][1]


Can you remind me why we're not looking at the Gene2A and Gene2B here?

For intergenic fusions, which look like Gene1A/Gene2A--Gene1B or Gene1A--Gene1B/Gene2B, and similar fusions, it gets a little complicated, because we would then need to check that the distance between Gene1A/Gene2A is the same in the reciprocal. So I've just stuck to fusions between genes.

kgaonkar6 · 2020-10-29T17:50:08Z

> Are the changes in analyses/fusion_filtering/results/pbta-fusion-recurrent-fusion-bysample.tsv and analyses/fusion_filtering/results/pbta-fusion-recurrently-fused-genes-bysample.tsv expected or unexpected?

This is expected because of the different sample selection issues by running sample() when we have multiple samples per Kids_First_Participant_ID.

jaclyn-taroni · 2020-11-04T22:27:16Z

> This is expected because of the different sample selection issues by running sample() when we have multiple samples per Kids_First_Participant_ID.

I would have expected sorting + setting a seed to have prevented that, but there may be some subtlety I'm missing. Either way, beyond the scope of this PR.
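
The expectation that sorting plus a seed gives a stable selection can be illustrated with a small sketch. It uses Python's `random` module rather than R's sample(), and the function name, IDs, and seed value are hypothetical; it only shows the general principle, not what the module does.

```python
import random


def pick_one_sample_per_participant(samples_by_participant, seed=2020):
    """Pick one sample per participant reproducibly.

    Sorting both the participants and their candidate samples removes
    any dependence on input order; the seeded generator then makes the
    random choice repeatable.
    """
    rng = random.Random(seed)
    picked = {}
    for participant in sorted(samples_by_participant):
        candidates = sorted(samples_by_participant[participant])
        picked[participant] = rng.choice(candidates)
    return picked


# Hypothetical IDs: the same data in two different input orders.
first = {"PT_1": ["BS_B", "BS_A"], "PT_2": ["BS_C"]}
second = {"PT_2": ["BS_C"], "PT_1": ["BS_A", "BS_B"]}
print(pick_one_sample_per_participant(first) == pick_one_sample_per_participant(second))
```

Note that this only holds while the set of candidates stays the same. If samples are added or removed, the selection can change even with the same seed.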

jaclyn-taroni

👍

kgaonkar6 · 2020-11-04T22:50:31Z

Thanks for the review @jaclyn-taroni !

kgaonkar6 added 2 commits October 21, 2020 09:43

adding kinase and reciprocal status

cc01238

re-run bash script

695e789

kgaonkar6 requested review from sjspielman and jaclyn-taroni October 21, 2020 15:43

kgaonkar6 and others added 2 commits October 21, 2020 11:50

update README

2d6da47

Merge branch 'master' into reciprocal_and_kinase

a622da8

kgaonkar6 added the ready for review label Oct 26, 2020

jaclyn-taroni reviewed Oct 26, 2020

jharenza mentioned this pull request Oct 28, 2020

Updated analysis: fusion summary to add fusions for LGG #808

Closed

kgaonkar6 and others added 4 commits October 29, 2020 13:27

Update analyses/fusion_filtering/04-project-specific-filtering.Rmd

98ebb02

Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>

Update analyses/fusion_filtering/04-project-specific-filtering.Rmd

503e991

Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>

Update analyses/fusion_filtering/04-project-specific-filtering.Rmd

6c1a15e

Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>

Update analyses/fusion_filtering/README.md

cfa77dd

Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>

Merge branch 'master' into reciprocal_and_kinase

635d781

kgaonkar6 requested a review from jaclyn-taroni November 4, 2020 22:18

jaclyn-taroni approved these changes Nov 4, 2020

Merge branch 'master' into reciprocal_and_kinase

fab4919

jaclyn-taroni merged commit d2f34a3 into AlexsLemonade:master Nov 5, 2020

This was referenced Nov 6, 2020

Updated analysis: Fusion Filtering - add reciprocal and kinase domain #812

Closed

update subset files for lgat fusions #835

Merged

This was referenced Nov 23, 2020

V18 release #849

Closed

add v18 release #857

Merged

This was referenced Dec 1, 2020

Updated analysis: RNA summary files and base histology subtyping update for v18 #861

Closed

PBTA Histologies: Fusion summary base (4 of N) #866

Merged

kgaonkar6 deleted the reciprocal_and_kinase branch December 8, 2020 21:19

jaclyn-taroni mentioned this pull request Dec 13, 2020

v18 CI files #871

Closed
