Updated analysis: Update CN consensus calls to use ControlFREEC CN as default? #964

jharenza · 2021-03-23T21:13:07Z

What analysis module should be updated and why?

Early on, we noticed many gains/losses in our oncoprint and have since created a focal CN file module. While the oncoprints look better, there are still many gains/losses in the TMB plots shown above the oncoprints, so the change below is worth mentioning and trying.

What changes need to be made? Please provide enough detail for another participant to make the update.

@zhangb1 noticed that for PNOC003 samples, there were still many losses being called for samples which, in some cases, had known amplifications (from clinical sequencing). He has summarized his findings here. His rationale was that ploidy is based on ControlFREEC and thus, using its corresponding CN would probably result in a better estimate of overall CN.

This piece of code:

OpenPBTA-analysis/analyses/copy_number_consensus_call/scripts/bed_to_segfile.R

Lines 119 to 128 in 2c1f5fa

    
    # Calculate summary stats from merged CNV calls.
    cnvs <- cnvs %>%
      dplyr::mutate(cnvkit_df = purrr::map(cnvkit_CNVs, segstrings_to_df),
                    freec_df = purrr::map(freec_CNVs, segstrings_to_df),
                    segmean = purrr::map_dbl(cnvkit_df, segmean_function),
                    cnvkit_cn = purrr::map_dbl(cnvkit_df, copies_wmedian),
                    freec_cn = purrr::map_dbl(freec_df, copies_wmedian),
                    copynum = ifelse(is.finite(cnvkit_cn), # use cnvkit if available
                                     cnvkit_cn, freec_cn), # otherwise use freec
                    num.mark = NA)

would change to:

    copynum = ifelse(is.finite(freec_cn), # use freec if available
                     freec_cn, cnvkit_cn), # otherwise use cnvkit
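
To make the effect of swapping the preference order concrete, here is a minimal base-R sketch on a toy table. The values and the column names `copynum_current` and `copynum_proposed` are made up for illustration; only `cnvkit_cn`, `freec_cn`, and the `ifelse(is.finite(...))` pattern come from the script above.

```r
# Hypothetical per-segment copy number estimates (not real data).
# NA means that caller produced no usable estimate for the segment.
toy <- data.frame(
  cnvkit_cn = c(1, 3, NA, 2),
  freec_cn  = c(4, NA, 2, 2)
)

# Current behavior: prefer CNVkit, fall back to ControlFREEC.
toy$copynum_current <- ifelse(is.finite(toy$cnvkit_cn),
                              toy$cnvkit_cn, toy$freec_cn)

# Proposed behavior: prefer ControlFREEC, fall back to CNVkit.
toy$copynum_proposed <- ifelse(is.finite(toy$freec_cn),
                               toy$freec_cn, toy$cnvkit_cn)

print(toy)
```

Only rows where both callers report a finite value and disagree (the first row here) change under the proposal; rows where one caller is missing resolve the same way in both versions.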

What input data should be used? Which data were used in the version being updated?

data being used and to be updated:
consensus_seg_annotated_cn_autosomes.tsv.gz
consensus_seg_annotated_cn_x_and_y.tsv.gz
pbta-cnv-consensus.seg.gz

other data to be updated:
pbta-cnv-consensus-gistic.zip

When do you expect the revised analysis will be completed?

one week, if QC needed

Who will complete the updated analysis?

@kgaonkar6 or @jashapiro or @jharenza ?

Thoughts, @jashapiro and @jaclyn-taroni ?

jaclyn-taroni · 2021-03-25T14:37:27Z

I am not opposed to this change in principle and would welcome a pull request. The evaluation would need to be more systematic (e.g., examine more if not all samples) and should be captured in this repository in a notebook.
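
As one possible starting point for such a notebook, the sketch below counts, per sample, the segments where both callers give a finite copy number but disagree. It is an assumption about what a systematic comparison could look like, not the evaluation that was actually run; the table `segs`, its `sample_id` column, and the `discordant` flag are hypothetical.

```r
# Hypothetical segment table: one row per consensus segment (not real data).
segs <- data.frame(
  sample_id = c("S1", "S1", "S1", "S2", "S2"),
  cnvkit_cn = c(1, 2, NA, 3, 2),
  freec_cn  = c(4, 2, 2, 3, NA)
)

# A segment is discordant when both callers report a finite copy number
# and the two values differ. Segments with a missing call are not counted.
both_called <- is.finite(segs$cnvkit_cn) & is.finite(segs$freec_cn)
segs$discordant <- both_called & (segs$cnvkit_cn != segs$freec_cn)

# Number of discordant segments per sample.
per_sample <- aggregate(discordant ~ sample_id, data = segs, FUN = sum)
print(per_sample)
```

Samples with many discordant segments would be the ones most affected by changing the default caller, so they are natural candidates for closer review.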

jharenza · 2021-03-25T15:54:24Z

> I am not opposed to this change in principle and would welcome a pull request. The evaluation would need to be more systematic (e.g., examine more if not all samples) and should be captured in this repository in a notebook.

Agreed, we need to examine this.

kgaonkar6 · 2021-03-31T15:38:45Z

I'm getting an error at the rule merge_all step when running the code as-is from the master branch in Docker.

 (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

It looks like a simple command that cats all the files, and it works fine when I run it in bash via docker exec, so this seems to be a Snakemake-specific bug?

jharenza · 2021-04-01T23:07:22Z

@jashapiro I know you are out this week, but when you are back, would you be able to help with the error above? Thanks!

jashapiro · 2021-04-02T10:59:36Z

I can try to help out. Is there any more detail you can give about exactly how you ran it @kgaonkar6? As far as I am aware, the workflow runs in CI, so there would have to be a pretty specific bug... as a first step, I would check that scratch/copy_consensus is empty before running the run_consensus_call.sh script.

kgaonkar6 · 2021-04-02T14:46:04Z

I ran the analysis with this command

docker exec -ti rerun-cn bash -c "cd /home/rstudio/OpenPBTA && analyses/copy_number_consensus_call/run_consensus_call.sh"

I did edit the script bed_to_segfile.R, which is used by the last step (rule make_segfile) in the Snakemake file, but the error occurs two steps earlier, in rule merge_all, where all files are merged.

I will try a re-run with an empty scratch/copy_consensus, thanks for the input!

kgaonkar6 · 2021-04-02T15:09:50Z

re-running with an empty scratch/copy_consensus worked! 🎉 Thank you, should have tried that before!

jharenza · 2021-04-02T17:11:16Z

thanks @jashapiro

kgaonkar6 · 2021-04-02T19:09:29Z

Sorry for bringing this up again, @jashapiro.

The analysis ran successfully with the testing data I had downloaded for another PR I'm working on, but it errored out at the same step when I ran it with v18. I emptied scratch/copy_consensus before running the same docker exec command as above. The log file has the error:

Error in rule merge_all:

jharenza · 2021-05-13T09:55:35Z

Closed with #1066

jharenza added the updated analysis label Mar 23, 2021

jharenza mentioned this issue Mar 31, 2021

Update bgCol in oncoprint module #975

Merged

5 tasks

jashapiro mentioned this issue Apr 5, 2021

Fix CNV conensus call workflow #984

Merged

jharenza added the blocking release label Apr 13, 2021

This was referenced May 7, 2021

Part5 Freec as default: Gistic rerun kgaonkar6/OpenPBTA-analysis#5

Closed

Part1: Freec as default and neutral NA #1066

Merged

Part2: Freec as default and neutral NA update to focal-cn-file-preparation #1067

Merged

jharenza closed this as completed May 13, 2021

kgaonkar6 mentioned this issue Aug 9, 2021

Updated manuscript: copy_consensus analyses filtering update documentation AlexsLemonade/OpenPBTA-manuscript#132

Closed
