This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

CN Status Heatmap (PR 2 of 2) #603

Merged
merged 82 commits into from
May 15, 2020

cansavvy
Collaborator

@cansavvy cansavvy commented Mar 4, 2020

Purpose/implementation Section

What scientific question is your analysis addressing?

We wanted a summary visualization of copy number status.

This second PR has the main notebook where the functions from the previous PR are implemented.

What was your approach?

Create a summary heatmap of copy number status from the consensus CNV call data.
This is done by binning the genome and calculating how much of each bin is covered by
the CNV consensus segments of each status.
A bin is declared a particular copy number status if that status's base pair
coverage is a certain threshold percentage larger than the other statuses'
coverage.
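
As a rough illustration of the coverage step, here is a minimal base R sketch. It is not the module's actual code: the toy `segs` table, the `bin_status_fractions()` helper, and all coordinates are hypothetical, and it assumes segments on a chromosome do not overlap one another.

```r
# Toy consensus segments on one chromosome (hypothetical values).
# Coordinates are 1-based and inclusive.
segs <- data.frame(
  start = c(1, 401, 801),
  end = c(400, 800, 1000),
  status = c("gain", "neutral", "loss"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fraction of the bin [bin_start, bin_end] that is
# covered by segments of each status.
bin_status_fractions <- function(segs, bin_start, bin_end) {
  # Base pairs of overlap between each segment and the bin (0 when disjoint)
  overlap <- pmax(0, pmin(segs$end, bin_end) - pmax(segs$start, bin_start) + 1)
  bin_width <- bin_end - bin_start + 1
  # Sum the overlap per status, then express it as a fraction of the bin
  vapply(split(overlap, segs$status), sum, numeric(1)) / bin_width
}

fracs <- bin_status_fractions(segs, bin_start = 1, bin_end = 500)
print(fracs)
```

In this toy case the bin spanning positions 1 to 500 comes out as 80% gain and 20% neutral; a threshold would then be applied to fractions like these to make the call for the bin.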

What GitHub issue does your pull request address?

Issue: #594

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Results

See the rendered notebook here:
https://cansavvy.github.io/openpbta-notebook-concept/cnv-chrom-plot/cn_status_heatmap.nb.html

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

These items already existed but have been updated in #602:

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

@cansavvy cansavvy added the work in progress label Mar 4, 2020
@cansavvy cansavvy requested review from jashapiro and cbethell and removed request for jashapiro April 13, 2020 19:29
Contributor

@cbethell cbethell left a comment

Looks good @cansavvy!

I have some comments below around clarifying some steps.

Five review threads on analyses/cnv-chrom-plot/cn_status_heatmap.Rmd (all resolved; four on outdated code).
@cansavvy cansavvy requested a review from cbethell April 21, 2020 13:31
Contributor

@cbethell cbethell left a comment

Hi @cansavvy, this looks just about ready to merge to me!

I did however notice that the bottom CN status annotation bar on the final heatmap does not appear to reflect all of the calls (it appears to reflect only losses and gains although the legend also includes neutral, unstable and uncallable).

That being said, is the bottom annotation based on the majority call for a whole chromosome? If so, would it be possible to add the chromosome labels to make it clear that the bottom annotation bar is relevant to each individual chromosome? (Although I am sure that this will be included in the figure description.)

I am going to approve this PR because I believe it can be merged without said labels 👍 although they would be nice.

@cansavvy
Collaborator Author

> I did however notice that the bottom CN status annotation bar on the final heatmap does not appear to reflect all of the calls (it appears to reflect only losses and gains although the legend also includes neutral, unstable and uncallable).

I'm not seeing this? When I'm looking at the plot it looks like all of the labels are in the legend? Sometimes the Markdown document cuts off part of it in the preview, but if you look at the pdf, it looks fine.

@cbethell
Contributor

> I'm not seeing this? When I'm looking at the plot it looks like all of the labels are in the legend? Sometimes the Markdown document cuts off part of it in the preview, but if you look at the pdf, it looks fine.

The legend looks fine, I am referring to the bottom annotation red and blue bar. That bottom bar reflects the chromosomal majority CN status, is that correct?

frac_loss > threshold ~ "loss",
frac_neutral > threshold ~ "neutral",
TRUE ~ "unstable"
frac_uncallable > frac_uncallable_val ~ "uncallable",
Collaborator Author

Is this what we wanted here? Or should we keep what's in master?

Member

@jashapiro jashapiro May 14, 2020

It looks like this is correct, as it now matches the function arguments. However, the docs for the function should also be changed to match.

Member

Unless you were looking at the order of the final two options, but I think that what you have is correct, because you got strange results the other way? I don't fully remember.

Collaborator Author

Yes. I think we had settled on this. https://alexslemonade.slack.com/archives/CNH4FND1C/p1586786889017900

Will make sure docs are updated though.
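
For anyone following the ordering question: `dplyr::case_when()` returns the label of the first condition that is TRUE, so a catch-all `TRUE ~ ...` has to come last or any rule after it can never fire. Below is a minimal base R sketch of that first-match behavior. The `call_status()` and `make_rule()` helpers, the rule lists, and the cutoff values 0.75 and 0.5 are all hypothetical and are not the values or code used in this PR.

```r
# Hypothetical cutoffs, chosen only for illustration
threshold <- 0.75
frac_uncallable_val <- 0.5

# Pair a status label with the test that decides it
make_rule <- function(label, test) list(label = label, test = test)

# Return the label of the first rule whose test is TRUE (first match wins)
call_status <- function(fracs, rules) {
  for (r in rules) {
    if (r$test(fracs)) {
      return(r$label)
    }
  }
  NA_character_
}

loss_rule <- make_rule("loss", function(f) f$loss > threshold)
neutral_rule <- make_rule("neutral", function(f) f$neutral > threshold)
uncallable_rule <- make_rule(
  "uncallable",
  function(f) f$uncallable > frac_uncallable_val
)
unstable_rule <- make_rule("unstable", function(f) TRUE) # catch-all

# Catch-all last: the uncallable rule is reachable
catch_all_last <- list(loss_rule, neutral_rule, uncallable_rule, unstable_rule)
# Catch-all before uncallable: the uncallable rule can never fire
catch_all_early <- list(loss_rule, neutral_rule, unstable_rule, uncallable_rule)

# A toy bin that is mostly uncallable
bin <- list(loss = 0.1, neutral = 0.2, uncallable = 0.7)
status_last <- call_status(bin, catch_all_last)
status_early <- call_status(bin, catch_all_early)
```

With the catch-all last, the toy bin is called "uncallable"; with the catch-all placed before the uncallable rule, the same bin falls through to "unstable".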

@jashapiro
Member

> > I'm not seeing this? When I'm looking at the plot it looks like all of the labels are in the legend? Sometimes the Markdown document cuts off part of it in the preview, but if you look at the pdf, it looks fine.

> The legend looks fine, I am referring to the bottom annotation red and blue bar. That bottom bar reflects the chromosomal majority CN status, is that correct?

I think what you are looking at is just the visual cue for where each chromosome starts and ends. Perhaps the color should be changed to avoid confusion with calls though.

Member

@jashapiro jashapiro left a comment

This looks good! A few minor comments, and one with substance: let's see what happens if we upgrade the resolution of the final figure.

Comment on lines 119 to 124
seg_data <- data.table::fread(file.path(
  input_dir,
  "pbta-cnv-consensus.seg.gz"
),
data.table = FALSE
)
Member

Indentation here is strange.

Suggested change
- seg_data <- data.table::fread(file.path(
-   input_dir,
-   "pbta-cnv-consensus.seg.gz"
- ),
- data.table = FALSE
- )
+ seg_data <- data.table::fread(
+   file.path(
+     input_dir,
+     "pbta-cnv-consensus.seg.gz"
+   ),
+   data.table = FALSE
+ )

Collaborator Author

I ran styler on it. It doesn't always make perfect choices.

Member

Bad styler, no biscuit!

Comment on lines 130 to 138
seg_data <- seg_data %>%
  # Join the histology column to this data
  dplyr::inner_join(dplyr::select(
    metadata,
    "Kids_First_Biospecimen_ID",
    "short_histology",
    "tumor_ploidy"
  ),
  by = c("ID" = "Kids_First_Biospecimen_ID")
Member

Same thing... try not to open 2 sets of parens on the same line that don't then close together, as it makes for non-semantic indentation.

Suggested change
- seg_data <- seg_data %>%
-   # Join the histology column to this data
-   dplyr::inner_join(dplyr::select(
-     metadata,
-     "Kids_First_Biospecimen_ID",
-     "short_histology",
-     "tumor_ploidy"
-   ),
-   by = c("ID" = "Kids_First_Biospecimen_ID")
+ seg_data <- seg_data %>%
+   # Join the histology column to this data
+   dplyr::inner_join(
+     dplyr::select(
+       metadata,
+       "Kids_First_Biospecimen_ID",
+       "short_histology",
+       "tumor_ploidy"
+     ),
+     by = c("ID" = "Kids_First_Biospecimen_ID")

show_row_names = FALSE,
bottom_annotation = chr_annot,
right_annotation = hist_annot,
heatmap_legend_param = list(nrow = 1)
Member

@jashapiro jashapiro May 15, 2020

The heatmap itself is drawn rasterized, which is mostly fine, but the resolution may be a bit lower than ideal. Can we see what happens to both file size and image quality if we bump it up? You might even go higher than this, depending on results. The docs (https://jokergoo.github.io/ComplexHeatmap-reference/book/a-single-heatmap.html#heatmap-as-raster-image) are a bit hard to interpret, but if the general setting is equivalent to 72 dpi, we might want to go as far as 4 here.

Suggested change
- heatmap_legend_param = list(nrow = 1)
+ heatmap_legend_param = list(nrow = 1),
+ raster_quality = 2

Member

@jashapiro jashapiro left a comment

Looks sharp!

@jashapiro jashapiro merged commit 9bc1a10 into AlexsLemonade:master May 15, 2020
jashapiro added a commit that referenced this pull request May 15, 2020
@cansavvy cansavvy deleted the cn-status-heatmap branch August 13, 2020 11:46
4 participants