This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Part1 Freec as default : Cnv consensus update #987

Closed
wants to merge 40 commits into from


kgaonkar6
Collaborator

@kgaonkar6 kgaonkar6 commented Apr 6, 2021

Purpose/implementation Section

What scientific question is your analysis addressing?

According to Bo's rationale, because ploidy is based on ControlFREEC, using ControlFREEC's corresponding CN would probably give a better estimate of overall CN.

What was your approach?

Just a change in how copynum is assigned, from:

# Calculate summary stats from merged CNV calls.
cnvs <- cnvs %>%
  dplyr::mutate(cnvkit_df = purrr::map(cnvkit_CNVs, segstrings_to_df),
                freec_df = purrr::map(freec_CNVs, segstrings_to_df),
                segmean = purrr::map_dbl(cnvkit_df, segmean_function),
                cnvkit_cn = purrr::map_dbl(cnvkit_df, copies_wmedian),
                freec_cn = purrr::map_dbl(freec_df, copies_wmedian),
                copynum = ifelse(is.finite(cnvkit_cn), # use cnvkit if available
                                 cnvkit_cn, freec_cn), # otherwise use freec
                num.mark = NA)

to use ControlFREEC if available and otherwise fall back to CNVkit:

                copynum = ifelse(is.finite(freec_cn), # use freec if available
                                 freec_cn, cnvkit_cn), #otherwise use cnvkit
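
To make the effect of swapping the priority concrete, here is a minimal sketch in base R on toy vectors. The values are made up for illustration and are not taken from the consensus file; NA stands in for a caller that has no usable copy number for a segment.

```r
# Toy per-segment copy numbers (hypothetical values, not real data).
# NA marks a segment for which that caller produced no usable CN.
freec_cn  <- c(2, NA, 3, NA)
cnvkit_cn <- c(2, 1, 4, NA)

# Previous behaviour: prefer CNVkit, fall back to ControlFREEC.
copynum_previous <- ifelse(is.finite(cnvkit_cn), cnvkit_cn, freec_cn)

# Updated behaviour: prefer ControlFREEC, fall back to CNVkit.
copynum_latest <- ifelse(is.finite(freec_cn), freec_cn, cnvkit_cn)

# Segments whose final copy number differs between the two rules.
# which() ignores the NA produced when both callers are missing.
changed <- which(copynum_previous != copynum_latest)
```

Only segments where both callers report a finite but different CN change under the new rule; segments covered by a single caller keep that caller's value either way.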

What GitHub issue does your pull request address?

#964

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Is there anything that you want to discuss further?

👀 Downstream analysis to check as first pass can be reviewed here :

  1. 🚧 focal-cn-file-preparation is affected by this change and was re-run with this update in "Part2 Freec as default: Cnv focal files update" (kgaonkar6/OpenPBTA-analysis#2). Please note that the figures in plots are not updated.
  2. oncoprint-landscape uses consensus_seg_most_focal_cn_status.tsv.gz from focal-cn-file-preparation, so we have also re-run it in "Part3 Freec as default: Oncoprint update" (kgaonkar6/OpenPBTA-analysis#3) to look at changes across all samples with this update.
  3. cnv-chrom-plot was also re-run in "Part4 Freec as default: Chromosome wide CNV plots per histology" (kgaonkar6/OpenPBTA-analysis#4) with the ControlFREEC-as-default CN calls to look for genome-wide changes.

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

table

What is your summary of the results?

By comparing pbta-cnv-consensus.seg.gz in this branch with the v18 release file, I found that 262 BS IDs have at least 1 segment with an updated copy.num. In total, 2489 rows (regions per BS ID) changed CN.
Columns named _latest are from the file in this branch, and columns named _previous are from the v18 release file.
changelog.txt
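
A comparison of this kind could be sketched as follows in base R. This is an illustration on toy data frames, not the code used to produce changelog.txt; the key columns (ID, chrom, loc.start, loc.end) are assumed from the usual SEG layout, and only copy.num and the _latest/_previous suffixes come from the description above.

```r
# Toy stand-ins for the v18 file and the file from this branch
# (hypothetical values; real input would be read from the .seg.gz files).
previous <- data.frame(
  ID        = c("BS_A", "BS_A", "BS_B"),
  chrom     = c("chr1", "chr2", "chr1"),
  loc.start = c(1, 100, 1),
  loc.end   = c(50, 200, 80),
  copy.num  = c(2, 3, 1)
)
latest <- previous
latest$copy.num <- c(2, 4, 1)

# Join segments on sample and coordinates (assumed key columns).
key_cols <- c("ID", "chrom", "loc.start", "loc.end")
both <- merge(latest, previous, by = key_cols,
              suffixes = c("_latest", "_previous"))

# A segment counts as changed if the values differ or if only one side is NA.
differs <- function(a, b) {
  (is.na(a) != is.na(b)) | (!is.na(a) & !is.na(b) & a != b)
}
changed <- both[differs(both$copy.num_latest, both$copy.num_previous), ]

n_changed_rows <- nrow(changed)               # rows whose CN changed
n_changed_ids  <- length(unique(changed$ID))  # BS IDs with at least 1 change
```

The inner join keeps only segments present in both files with identical coordinates, which fits this change because only the copy number assignment differs between the two versions.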

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

@kgaonkar6 kgaonkar6 added the work in progress Used to label (non-draft) pull requests that are not yet ready for review label Apr 6, 2021
@kgaonkar6 kgaonkar6 changed the title Cnv consensus update Part1 Freec as default : Cnv consensus update Apr 6, 2021
@jharenza
Collaborator

After doing some digging into the changelog.txt file above, I found the following and have updated a ticket in our D3b bixu tracker repo to ask for more information from @zhangb1. Copying below what I posted there (even though some links to the private repo can't be opened):

hi @zhangb1 @yuankunzhu @aadamk - I need to re-open this given the request to update this within OpenPBTA. @kgaonkar6 had been working on this issue and thus far has opened 4 PRs:

  1. Update consensus CNV calls to use ControlFREEC CN
  2. Update focal CNV files
  3. Update oncoprint
  4. Update chromosome wide plots - here, she has created a difference file and we see the following:
    a. 76% of calls have the same CN status after the change (new_status == 52610, same_status == 168895)
    b. There are 144 samples with calls having different CN status, and of these, there are 15 PNOC003 samples
    c. Of the PNOC samples, 0 genes in this goi list for the oncoprint are affected, thus, the oncoprints look identical using either ControlFREEC CN or CNVkit CN, below:

ControlFREEC CN
pnoc_primary_only_goi_oncoprint_freec
CNVkit CN
pnoc_primary_only_goi_oncoprint_cnvkit

I cannot reproduce the above plot with the large number of deletions using either of these codes from OpenPBTA.

My questions for @zhangb1 are:

  1. Have you tried using OpenPBTA consensus CNV + focal CNV code since May 2020? There may have been updates to these modules between May 2020 and now which may have resolved these deletions.
  2. When you say

only apply the tumor_ploidy from the histology file

in the comment above, what do you mean by that? The ploidy in the CNV module should match what is in the histology file.

  3. Based on this comment from @aadamk, was there a time when you were not using ploidy from the histology file, but instead using it directly from ControlFREEC's output file, and if so, were there discrepancies between the two?

General for @yuankunzhu and @zhangb1 - I think we should double check that all ControlFREEC ploidy output matches what is in the histology file. I think this should still be the case, because if it did not match and @zhangb1 was using info from the ControlFREEC output rather than the histology file, then I would expect to see the large number of losses in my oncoprints.

@kgaonkar6 kgaonkar6 added the ready for review Used to label pull requests that are ready for review label May 5, 2021
Part4 Freec as default: Chromosome wide CNV plots per histology
Part3 Freec as default: Oncoprint update
Part2 Freec as default: Cnv focal files update
Member

@jaclyn-taroni jaclyn-taroni left a comment


Approving on the basis of the results discussed in #1010 and because the code changes in here address #964 and #1010. I haven't examined all the plots here and am going on trust!

@jaclyn-taroni jaclyn-taroni added merge next and removed ready for review Used to label pull requests that are ready for review labels May 11, 2021
@kgaonkar6
Collaborator Author

I will rerun the oncoprint code because it seems that in #1009 the file prefix was changed from "all_participants_primary_only" to "primary_only" and from "all_participants_primary-plus" to "primary-plus", which seems to be causing the conflicts.

@kgaonkar6
Collaborator Author

Just re-running didn't work, so I removed oncoprint-landscape from my branch and then checked out the most up-to-date module (with #1009) from master:

git checkout origin/master oncoprint-landscape

and then ran the bash script, but more conflicts have come up now 😩. Any suggestions?

@jaclyn-taroni
Member

This branch was doing some weird stuff for me locally, so my suggestion is to start fresh - more info on next steps in #1064!

@kgaonkar6 kgaonkar6 closed this May 11, 2021
@kgaonkar6 kgaonkar6 reopened this May 11, 2021
@kgaonkar6
Collaborator Author

I thought about this a little more and checked all the changes in #1009. One change that I had not pulled from the remote cnv-consensus-update branch (changed in #1009) was the change in figures/palettes/oncoprint_color_palette.tsv, so the figures used the old palette, which caused the conflicts.

I did the following:

# gather changes from #1009 added in master
git checkout origin/master oncoprint-landscape
# gather all the changes from master merged in remote cnv-consensus-update
git pull origin cnv-consensus-update
# run oncoprint-landscape/run-oncoprint.sh
docker exec -ti cnv-rerun bash -c "cd /home/rstudio/OpenPBTA/analyses/ && oncoprint-landscape/run-oncoprint.sh"
# commit oncoprint-landscape
git push origin cnv-consensus-update

@kgaonkar6
Collaborator Author

Relatedly, figures/palettes/histology_label_color_table.tsv, which is used in the cnv-chrom-plot module, was updated during the v19 release.

I think I'll just rerun the whole analysis, as you suggested before, to capture all v19 changes. Closing this PR.


3 participants