PR 3 of n - Molecular subtyping embryonal tumors (Wrangle 'final' table) #458
Tagging @jharenza to weigh in.
We use the consensus SEG file, rather than the annotated version from `focal-cn-file-preparation`
Use new fusion-summary file committed to repo
Copy number, SV files
In the last few commits, I tried to look into the BCOR internal tandem duplications and I'm also using the new |
Hi @jaclyn-taroni - I think this looks really great! A few comments and notes to myself following along this notebook.

"First subset to embryonal tumors, excluding any derived cell lines." - there are no cell lines in this dataset, but future code maybe should not be as restrictive.

Re: C19MC coordinates, I found a few publications with older genome coordinates and lifted them over, but they are not very consistent with one another.
From R biomaRt:
"Granted, we don’t currently have enough information to look specifically at CNS HGNET-BCOR and therefore perhaps can not classify any tumor as CNS Embryonal, NOS." - I would argue that if we do not have enough information, the remaining samples should be classified as CNS Embryonal, NOS.

"For some samples that have a TTYH1 fusion, we do not have DNA data to check for C19MC amplification." - According to this paper, "Embryonal tumors with multilayered rosettes (ETMRs) are rare, deadly pediatric brain tumors characterized by high-level amplification of the microRNA cluster C19MC. We performed integrated genetic and epigenetic analyses of 12 ETMR samples and identified, in all cases, C19MC fusions to TTYH1 driving expression of the microRNAs." All of the cases with C19MC amplification had TTYH1 fusions, which were causal in the miRNA cluster amplification, so I think it is safe to label all of these as C19MC-altered.

Manual check for BS_69VS8PS1, BS_TE8QFF7T, and BS_K07KNTFY DNA:

Great catch on the MN1 fusions! It looks like [...] For [...] For [...]

From here, do you want to add a file with subtypes?
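The two labeling rules proposed in this comment can be written down as a small function. This is only a sketch in Python (the notebook itself is R), and `label_embryonal_subtype` and its argument are hypothetical names, not anything in the module. It deliberately ignores every other line of evidence (MN1 fusions, copy number, expression) and only encodes "TTYH1 fusion implies C19MC-altered, otherwise fall back to NOS".

```python
def label_embryonal_subtype(has_ttyh1_fusion: bool) -> str:
    """Hypothetical helper encoding only the two rules discussed above."""
    if has_ttyh1_fusion:
        # A TTYH1 fusion is treated as evidence of C19MC alteration,
        # even when no DNA data is available to confirm amplification.
        return "ETMR, C19MC-altered"
    # Not enough information for a more specific call.
    return "CNS Embryonal, NOS"


# Made-up sample IDs, for illustration only.
example_samples = {"sample_A": True, "sample_B": False}
labels = {
    sample_id: label_embryonal_subtype(has_fusion)
    for sample_id, has_fusion in example_samples.items()
}
print(labels)
```

Keeping the fallback as the last branch means any sample without positive evidence still receives a label rather than being dropped from the table.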
@jharenza - including cell lines makes the mapping between RNA-seq and DNA-seq samples a little more difficult, because you would also have to join on the sample composition. Excluding them is consistent with how we've approached other subtyping efforts because of this difficulty.
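To make the difficulty concrete, here is a rough Python sketch of the matching problem. Everything in it is an assumption for illustration: the field names (`participant_id`, `sample_id`, `composition`, `assay`, `biospecimen_id`), the `"Derived Cell Line"` value, and the `pair_rna_dna` helper are hypothetical and not taken from the notebook. When cell lines are excluded, participant and sample are enough to pair RNA with DNA; when they are kept, the composition has to become part of the join key.

```python
from collections import defaultdict


def pair_rna_dna(samples, exclude_cell_lines=True):
    """Group biospecimen IDs by matching key and assay type (hypothetical)."""
    groups = defaultdict(lambda: {"RNA": [], "DNA": []})
    for s in samples:
        if exclude_cell_lines and s["composition"] == "Derived Cell Line":
            continue
        key = (s["participant_id"], s["sample_id"])
        if not exclude_cell_lines:
            # With cell lines present, composition must be part of the key
            # so tumor RNA is not paired with cell-line DNA.
            key += (s["composition"],)
        groups[key][s["assay"]].append(s["biospecimen_id"])
    return dict(groups)


# Made-up records, for illustration only.
example = [
    {"biospecimen_id": "BS_RNA1", "participant_id": "PT_1", "sample_id": "S1",
     "composition": "Solid Tissue", "assay": "RNA"},
    {"biospecimen_id": "BS_DNA1", "participant_id": "PT_1", "sample_id": "S1",
     "composition": "Solid Tissue", "assay": "DNA"},
    {"biospecimen_id": "BS_DNA2", "participant_id": "PT_1", "sample_id": "S1",
     "composition": "Derived Cell Line", "assay": "DNA"},
]
print(pair_rna_dna(example))
print(pair_rna_dna(example, exclude_cell_lines=False))
```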
With visualization!
Okay @jharenza - I'm going to call every sample with a TTYH1 fusion ETMR, C19MC-altered. What about |
I went with ETMR, NOS for now. Here is the table with subtype labels: https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/101b1c4470abfbd0e3d14281b3705d1fcb2bfe00/analyses/molecular-subtyping-embryonal/results/embryonal_tumor_molecular_subtypes.tsv

I broke out the C19MC cleaning steps into a notebook with visualizations so that I could capture the table with coordinates from #458 (comment). That notebook is available here: https://jaclyn-taroni.github.io/openpbta-notebook-concept/03-clean-c19mc-data.nb.html

New version of the notebook with the subtyping calls is here: https://jaclyn-taroni.github.io/openpbta-notebook-concept/04-table-prep.nb.html
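For readers who want the gist of a region-based cleaning step without opening the notebook, the core operation is an interval-overlap filter on copy-number segments. The Python sketch below is an illustration under stated assumptions, not the notebook's code: the segment field names and `segments_in_region` are hypothetical, and `EXAMPLE_REGION` is a placeholder, not the C19MC coordinates from the table referenced above.

```python
# Placeholder region for illustration only; NOT the real C19MC coordinates.
EXAMPLE_REGION = ("chr19", 1_000, 2_000)


def segments_in_region(segments, region):
    """Return the segments that overlap the region (closed intervals)."""
    chrom, start, end = region
    return [
        seg for seg in segments
        if seg["chrom"] == chrom and seg["start"] <= end and seg["end"] >= start
    ]


# Made-up segments; field names are assumptions, not the SEG file's columns.
example_segments = [
    {"sample_id": "sample_A", "chrom": "chr19", "start": 500, "end": 1_500, "seg_mean": 1.2},
    {"sample_id": "sample_B", "chrom": "chr19", "start": 3_000, "end": 4_000, "seg_mean": 0.1},
    {"sample_id": "sample_C", "chrom": "chr1", "start": 1_200, "end": 1_800, "seg_mean": 0.9},
]
hits = segments_in_region(example_segments, EXAMPLE_REGION)
print([seg["sample_id"] for seg in hits])
```

Any threshold on the segment mean for calling an amplification would be a separate decision and is left out here.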
This should be ETMR, NOS, you are right!
Visualizations look great and so does the subtyping table!
This looks good!
Oh wait @jaclyn-taroni, I just realized I think we should add |
Fixed in e8d4148!
Awesome, you're so quick! All good now!
Purpose/implementation Section
What scientific question is your analysis addressing?
Molecular subtyping of non-MB, non-ATRT embryonal tumors
What was your approach?
The notebook I'm adding here wrangles the relevant fusion, expression, and copy number data listed on #251. It does not currently include information about the presence or absence of BCOR tandem duplications, as I expect that would require some cleaning of the structural variant data that hasn't been done yet.
You can view the rendered version of the notebook here: https://jaclyn-taroni.github.io/openpbta-notebook-concept/03-table-prep.nb.html
You can view the table with all the relevant data here: https://github.com/jaclyn-taroni/OpenPBTA-analysis/blob/d024b044255e2fa1073b9ffff393ebf9bef76cfc/analyses/molecular-subtyping-embryonal/results/embryonal_tumor_subtyping_relevant_data.tsv
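As a rough picture of what wrangling fusion calls into a per-sample table can look like, here is a small Python sketch. It is not the notebook's `wrangle_fusions` function. The `flag_fusions` helper, the `GENE1--GENE2` fusion-name format, and the sample IDs are assumptions made for illustration. TTYH1 and MN1 are used as genes of interest only because they come up in the discussion.

```python
def flag_fusions(fusion_calls, genes_of_interest):
    """Return {sample_id: {gene: bool}} marking fusions involving each gene.

    fusion_calls: iterable of (sample_id, fusion_name) pairs, where
    fusion_name is assumed to look like "GENE1--GENE2".
    """
    flags = {}
    for sample_id, fusion_name in fusion_calls:
        partners = set(fusion_name.split("--"))
        # Start every sample with all genes unflagged.
        row = flags.setdefault(sample_id, {g: False for g in genes_of_interest})
        for gene in genes_of_interest:
            if gene in partners:
                row[gene] = True
    return flags


# Made-up calls, for illustration only.
example_calls = [
    ("sample_A", "TTYH1--GENE_X"),
    ("sample_A", "GENE_Y--GENE_Z"),
    ("sample_B", "GENE_W--MN1"),
]
print(flag_fusions(example_calls, ["TTYH1", "MN1"]))
```

Samples with no fusion calls at all never appear in the result, so a join against the full sample list would still be needed to record them as negative.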
What GitHub issue does your pull request address?
#251
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
The `wrangle_fusions` function, chr19 amplification
Is there anything that you want to discuss further?
What else, if anything, needs to be added to complete calls as part of #251?
Results
What is your summary of the results?
Please see: https://jaclyn-taroni.github.io/openpbta-notebook-concept/03-table-prep.nb.html#subtyping
Reproducibility Checklist
Documentation Checklist
`README` and it is up to date.
`analyses/README.md` and the entry is up to date.