v20 CNV update part4 : Rerun gistic and molecular subtyping v20 #1127
This looks good to me - changes are minimal/expected and the criteria are still stringent.
Just wanted to document here that I reran focal-cn, since we added logic (5c33bbe) to read the relative path only when subtyping and the HTML files needed to be updated. |
@jaclyn-taroni, sorry to reach out to you like this, but I have a question about running the GISTIC module. Have you ever gotten an error like this:
Looks like this is coming from changing copy number from 2 to NA for neutral calls (in the |
Hi @runjin326 - I have never encountered this specific error unfortunately. I assume that this part of the error message
means that all of the inputs are being converted to NaN during some internal GISTIC step. You might need to do what was done in OpenPBTA-analysis/analyses/chromothripsis/02-run-shatterseek-and-classify-confidence.R, Line 67 in d31c927
|
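As a rough illustration of the problem discussed above (neutral calls stored as NA and then turning into NaN downstream), here is a minimal sketch that fills missing copy numbers with a neutral value before the segments are handed to another tool. It is not a transcription of the referenced line in the chromothripsis script. The `copy_number` field name, the sample rows, and the choice of 2 as the neutral value are assumptions made for illustration only.

```python
import math

NEUTRAL_COPY_NUMBER = 2  # assumption: diploid baseline used for neutral calls


def is_missing(value):
    """Return True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_neutral(segments, neutral=NEUTRAL_COPY_NUMBER):
    """Return copies of the segments with missing copy numbers set to neutral."""
    filled = []
    for seg in segments:
        seg = dict(seg)  # copy so the input rows are left untouched
        if is_missing(seg["copy_number"]):
            seg["copy_number"] = neutral
        filled.append(seg)
    return filled


# Hypothetical rows; field names are illustrative, not the real seg file columns.
segments = [
    {"sample": "sample_A", "chrom": "chr9", "copy_number": 1},
    {"sample": "sample_A", "chrom": "chr10", "copy_number": float("nan")},
    {"sample": "sample_A", "chrom": "chr11", "copy_number": None},
]
filled = fill_neutral(segments)
```

Whether filling with the neutral value is appropriate depends on why the neutral calls were set to NA in the first place, so this should be checked against the consensus pipeline before use.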
Thanks so much for the prompt reply - I will try that method :) |
@jaclyn-taroni, I am so sorry, but I have another related question. For OpenTargets, we made modifications to run consensus on WGS samples only and to use CNVkit results for WXS samples only. Do you see any issue with running GISTIC directly on the CNVkit seg file? Also, when I was trying to run it, I realized that there are some weird chromosome names in the file:
Have we been ignoring those when calling consensus? Or how do we deal with these? |
I'll preface this by saying that I do not have a lot of experience running GISTIC on different datasets. A concern that comes to mind for me is whether or not GISTIC expects genome-wide measurements and if CNVkit on WXS provides genome-wide measurements. If GISTIC does expect genome-wide measurements and if CNVkit does not provide them, I would be concerned that the input data is then violating some assumptions. I assume that we do not consider anything outside of the primary assembly for CN consensus based on what is in this file, which are the genomic regions that are callable in that pipeline: https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/d31c927a27813ec0b8032fbe768002f31723636f/analyses/copy_number_consensus_call/ref/cnv_callable.bed But I would need to ask someone more involved in writing that pipeline to be sure. |
Thanks so much! I will definitely look into their assumptions about whether they expect genome-wide measurements. If you get a chance, could you please also check with someone that was involved in writing the pipeline as well? Greatly appreciate that! |
I am seeing the same error now and the gistic run only updates the
I believe the changes suggested by @jaclyn-taroni might work. I can also open a ticket to rerun the GISTIC module (this will also affect the HGG and EPN subtyping modules). |
@jaclyn-taroni and @kgaonkar6, I actually looked into the original publication for GISTIC v2.0 and they specified their testing data as follows:
Looks like they used a SNP array, so I am assuming they don't require genome-wide measurements and it would be fine to run on WXS samples? |
That array is genome-wide. |
@jaclyn-taroni, oh right! I kept digging and asking around but still couldn't figure out whether it accepts panel or WXS data. Please let me know if anyone knows the answer! |
Yes, all of that happens here. |
Thanks so much! |
@runjin326 mitochondrial and alt sequences are removed in the consensus pipeline: OpenPBTA-analysis/analyses/copy_number_consensus_call/Snakefile Lines 135 to 137 in d31c927
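For readers who want to apply a similar filter to a seg file outside the consensus pipeline, here is a minimal sketch of dropping mitochondrial and alt/unplaced contigs by chromosome name. It is not the Snakefile rule referenced above. The row layout, the `keep_primary` helper, and the example contig names are hypothetical and are not the names from the file discussed in this thread.

```python
import re

# Primary assembly: autosomes 1-22 plus X and Y, with an optional "chr" prefix.
PRIMARY_CHROM = re.compile(r"^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y)$")


def keep_primary(rows, chrom_index=1):
    """Keep only rows whose chromosome field is on the primary assembly."""
    return [row for row in rows if PRIMARY_CHROM.match(row[chrom_index])]


# Hypothetical seg-like rows: (sample, chrom, start, end, seg_mean).
rows = [
    ("sample_A", "chr1", 100, 200, 0.1),
    ("sample_A", "chrM", 1, 16000, 0.0),
    ("sample_A", "chrUn_KI270742v1", 1, 5000, -0.3),
    ("sample_A", "chr17_GL000205v2_random", 1, 5000, 0.2),
    ("sample_A", "chrX", 300, 400, -1.0),
]
kept = keep_primary(rows)
```

Anchoring the pattern at both ends matters here: names such as `chr17_GL000205v2_random` start with a valid chromosome but are excluded because the match must cover the whole field.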
For any other questions that are not directly related to this PR but are related to data in this project (OpenPBTA) specifically, I'd recommend filing a data question issue: https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/new?assignees=&labels=data&template=data-question.md&title= |
@jaclyn-taroni, thanks so much! I wasn't sure where to post it, but I will submit a data question ticket next time! :) |
Please merge #1123 #1124 #1126 before this PR.
Purpose/implementation Section
What scientific question is your analysis addressing?
Rerun subtyping for v20 with updated CNV #1114
What was your approach?
Rerun all molecular subtypes with the updated run-for-subtyping.sh, which now includes the CNV modules required for subtyping
What GitHub issue does your pull request address?
#1125
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
See summary for discussion about changes.
Is there anything that you want to discuss further?
Expected changes:
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
Yes
Results
What types of results are included (e.g., table, figure)?
tables
What is your summary of the results?
BS_DW1CYEXP, BS_QDGHHS4S, BS_Z9PKZ4RT
For example, I checked one instance where the focal annotation was a loss in CDKN2A because CN was set to 2 in the consensus seg file, but when I checked CNVkit and FREEC I didn't see any calls supporting the loss
Reproducibility Checklist
Documentation Checklist
README
and it is up to date.analyses/README.md
and the entry is up to date.