This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

PR 2 of 4 for SNV Consensus: Merge callers script #184

Merged
merged 180 commits into from
Oct 30, 2019
Changes from 173 commits
37a78fb
Set up the set up
Sep 23, 2019
a25be13
Add circle CI test and Docker config
Sep 23, 2019
4a40580
Add some more comments
Sep 23, 2019
8a4d33b
Merge remote-tracking branch 'upstream/master' into cansav09/snv-call…
Sep 23, 2019
e704c0b
Set up Rprojroot for circle CI test to work better
Sep 23, 2019
56014dd
Fixing Circle CI file.
Sep 23, 2019
03a0770
Change read_tsv to data.table::fread for big file
Sep 23, 2019
391fb52
read in the .gz file
Sep 24, 2019
93cb818
push plot function changes
Sep 24, 2019
908aff9
Fix an error
Sep 25, 2019
ee08152
Merge branch 'master' into cansav09/snv-caller_set_up
cansavvy Sep 26, 2019
6f18dff
Add missing package to Dockerfile
Sep 26, 2019
38cd490
Reduce cosmic file to only the brain sample mutations
Sep 26, 2019
b9d8333
Update README with changes to cosmic file
Sep 26, 2019
a7c5495
re-updated Dockerfile
Sep 26, 2019
efde7ef
Ran a linter on set up script
Sep 26, 2019
e2b43a8
Merge branch 'master' into cansav09/snv-caller_set_up
cansavvy Sep 26, 2019
f4c7534
Comment out of date
cansavvy Sep 26, 2019
83d001a
Get rid of old WGX/WXS bed file set up
Sep 26, 2019
30664ec
Merge branch 'master' into cansav09/snv-caller_set_up
cansavvy Sep 27, 2019
374c6a1
Incorporate initial PR suggestions from @jashapiro and @cbethell
Sep 27, 2019
e99d0b4
Push a working bash script
Sep 27, 2019
5cb619e
Add bash script to circle CI
Sep 27, 2019
5e79092
Add usage section in README and change name of script
Sep 30, 2019
0b9f4e6
Merge branch 'master' into cansav09/snv_calculations
cansavvy Sep 30, 2019
83cad86
Add some more comments
Sep 30, 2019
f383869
Merge remote-tracking branch 'origin/cansav09/snv_calculations' into …
Sep 30, 2019
2e908a1
Correct a couple things in the README
Sep 30, 2019
1146afc
Get rid of remnant comment
Sep 30, 2019
abb7dda
Fix a typo!
Sep 30, 2019
76ec476
Merge branch 'master' into cansav09/snv_calculations
cansavvy Sep 30, 2019
20f5d4c
Add Usage to the TOC
Sep 30, 2019
b63c69f
Merge branch 'master' into cansav09/snv_calculations
cansavvy Sep 30, 2019
cf9999c
Add more documentation to the README
Oct 1, 2019
068b08c
Update Circle CI
Oct 1, 2019
e9a5bde
Push more exact bash script
Oct 1, 2019
efd2264
Fix a couple issues with handling metadata file path
Oct 1, 2019
e732516
Get rid of dev remnants
Oct 1, 2019
bab4319
Couple changes for readability
Oct 1, 2019
655bc00
Merge branch 'master' into cansav09/snv_calculations
cansavvy Oct 1, 2019
748e677
Add some things to README and get rid of part of bash that isn't there
Oct 1, 2019
9511997
Merge remote-tracking branch 'origin/cansav09/snv_calculations' into …
Oct 1, 2019
d25cc45
Found dumb mistake
Oct 1, 2019
ff50ff9
Changed [*] to [+]
Oct 1, 2019
26cc61d
I mean [*] to a [@] which makes more sense.
Oct 1, 2019
f0916eb
Merge branch 'master' into cansav09/snv_calculations
cansavvy Oct 2, 2019
0c0a023
Fix some of the overwrite handling
Oct 3, 2019
4666b00
Merge remote-tracking branch 'origin/cansav09/snv_calculations' into …
Oct 3, 2019
0fa591d
Add template and report script
Oct 3, 2019
84c0f4b
Add run_eval to bash script
Oct 3, 2019
cc5494c
Merge branch 'master' into cansav09/snv-add-eval
cansavvy Oct 4, 2019
fb6ec2a
Get rid of dev remnants
Oct 4, 2019
7e43739
Make better handling of strategy that is called but not there
Oct 4, 2019
1acd00b
Get rid of dev remnant again
Oct 4, 2019
5cce1fe
Get rid of stray `\`
Oct 4, 2019
bffb002
File got misplaced
Oct 4, 2019
b2b2eb4
Get rid of cosmic mutations without proper coordinates
Oct 4, 2019
8800d8d
Fixed a wrinkle with the COSMIC file
Oct 4, 2019
2999c65
Merge branch 'master' into cansav09/snv-add-eval
cansavvy Oct 4, 2019
79fcdaa
re-run a linter on everything.
Oct 4, 2019
4460630
Merge remote-tracking branch 'origin/cansav09/snv-add-eval' into cans…
Oct 4, 2019
689583a
Couple more minor touches
Oct 4, 2019
983eefb
Make set up files not run if they are already existing
cansavvy Oct 7, 2019
fb1a6ad
Fix handling of COSMIC file creation
cansavvy Oct 7, 2019
e67e51e
Incorporate @jashapiro 's suggestions
Oct 8, 2019
87eba81
Add a few more @jashapiro suggestions
Oct 8, 2019
7c0d87c
Merge branch 'master' into cansav09/snv-add-eval
cansavvy Oct 8, 2019
f4cb768
Circle CI does not have a kitematic directory. Get rid
Oct 8, 2019
6d9f09f
Merge remote-tracking branch 'origin/cansav09/snv-add-eval' into cans…
Oct 8, 2019
86b2930
Make warning instead of stop
Oct 8, 2019
c0446e4
Missing `ggplot2::`
Oct 8, 2019
d3647e0
Remove reference files after use to try to reduce memory usage
Oct 8, 2019
435c517
Make one big mutate
Oct 8, 2019
aa466e0
Dumb extra comma
Oct 8, 2019
2ef5558
Add VAF_FILTER option and it's circle CI component
Oct 9, 2019
d2330fd
get rid of typo
Oct 9, 2019
ca4414f
Re-fix Circle CI file
Oct 9, 2019
c123506
Fix WXS if statement
Oct 9, 2019
3faab85
Fix default Circle CI option
Oct 9, 2019
1912d0e
Make indels come last in the barplot
Oct 9, 2019
9eaa124
Fix order of barplot graph
Oct 9, 2019
251366a
Merge branch 'master' into cansav09/snv-add-eval
cansavvy Oct 9, 2019
e5dee1d
Add the comparison notebook
Oct 9, 2019
ccc0c27
Merge branch 'cansav09/snv-add-eval' into snv-comparison
Oct 9, 2019
91c8d99
Add initial comparison notebook
Oct 9, 2019
905bb2a
Notebook adjustments
Oct 17, 2019
ba40502
Fixed dumb label mix-up problem
Oct 17, 2019
1919b05
Further honing things
Oct 17, 2019
0470405
Added the other plots except that one combo plot
Oct 18, 2019
89df186
Add rendered version too
Oct 18, 2019
a1fdac4
Rough draft of all plots here
Oct 18, 2019
eb5ffa1
Get rid of upsettR's blank plot
Oct 18, 2019
288fca2
Refresh the notebook
Oct 18, 2019
0ab7171
Run linter
Oct 18, 2019
5a73511
Merge remote-tracking branch 'origin/snv-comparison' into snv-comparison
Oct 18, 2019
8a942fe
Change name to reflect the data better
Oct 18, 2019
55607e7
Add to CircleCI and Dockerfile
Oct 18, 2019
595bf1e
Merge branch 'master' into snv-comparison
cansavvy Oct 18, 2019
0e186ab
Try to fix busted CircleCI command
Oct 18, 2019
eef394f
Merge remote-tracking branch 'origin/snv-comparison' into snv-comparison
Oct 18, 2019
97815dc
Attempt to fix Dockerfile build problem
Oct 18, 2019
465c69a
Attempt to fix docker build
Oct 21, 2019
d77cf90
Some Docker probs fixed
Oct 21, 2019
b242df2
Push latest notebook and Dockerfile
Oct 21, 2019
e7c3598
Dockerfile appears to be building right.
Oct 21, 2019
bd462d3
Couple more things to finish up Dockerfile adds
Oct 21, 2019
86282aa
development thing was left
Oct 21, 2019
a483e11
Add Template for results discussion
Oct 22, 2019
38f0751
Updates to notebook. Function and etc.
Oct 22, 2019
1e02b71
Refresh notebook
Oct 22, 2019
f836095
Add png saves to each plot
Oct 22, 2019
cd2643a
Added options to calculate_vaf_tmb.R script
Oct 22, 2019
ed4d8f9
Add set up for vaf_filter experiment
Oct 22, 2019
b9d2519
Push separate notebook for vaf_filter exp
Oct 23, 2019
5617f44
Changes to README
Oct 23, 2019
fc5dba9
Merge branch 'master' into snv-comparison
cansavvy Oct 23, 2019
8486fd1
Add more documentation
Oct 23, 2019
cbffd8e
Push VAF cutoff experiment notebook
Oct 23, 2019
d11223c
Update title of vaf notebook
Oct 23, 2019
334e365
Update notebook
Oct 23, 2019
4690bc4
Fix path problem and also extend no_region option
Oct 23, 2019
dbdb9c4
Merge remote-tracking branch 'upstream/master' into snv-comparison
Oct 23, 2019
70e8c35
Fix typo
Oct 23, 2019
518c0d6
Added some documentation
Oct 23, 2019
b23dafa
Prep consensus mutation file saving
Oct 23, 2019
ae6f9fa
Merge remote-tracking branch 'upstream/master' into snv-comparison
Oct 24, 2019
fcd8436
Fix sex chr mislabeling in COSMIC file per @jashapiro 's suggestion
Oct 24, 2019
880d1f0
Get rid of regional analysis. Linter the notebooks Add changes to plots
Oct 24, 2019
6d01ac7
Updated COSMIC file
Oct 24, 2019
a4a6867
Streamline this branch
Oct 25, 2019
5c5eda7
Revert unneeded changes
Oct 25, 2019
b6f86b7
Clean up README
Oct 25, 2019
ccc2d7c
Add --no_region option to README
Oct 25, 2019
2813d8e
Merge remote-tracking branch 'jaclyn-taroni/master' into snv-caller_r…
Oct 25, 2019
c706e16
Merge branch 'snv-caller_revamp' into compare_callers_nb
Oct 25, 2019
a8e10c6
Update CircleCI
Oct 25, 2019
57e3e2f
Get rid of region analysis to save on memory usage
Oct 25, 2019
93ff0e0
Add documentation about comparison to README
Oct 25, 2019
271dc2b
Merge remote-tracking branch 'upstream/master' into compare_callers_nb
Oct 25, 2019
c167dfb
Merge remote-tracking branch 'upstream/master' into compare_callers_nb
Oct 28, 2019
7dcd5d6
Linter and switch to rds
Oct 28, 2019
6d0a51d
Upload results
Oct 28, 2019
406e821
refresh notebook and results
Oct 28, 2019
9ce4c41
Updated notebook
Oct 28, 2019
76bca09
Change name of bash script and make it run the notebook too
Oct 28, 2019
6910580
Update README
Oct 28, 2019
49fddf7
column problem *should* be fixed
Oct 29, 2019
aa91345
Merge branch 'master' into compare_callers_nb
cansavvy Oct 29, 2019
b66fb62
Push refreshed notebook and results
Oct 29, 2019
38d220d
Merge remote-tracking branch 'origin/compare_callers_nb' into compare…
Oct 29, 2019
a5fdddc
Add a bit more info in the README
Oct 29, 2019
d635d06
Merge branch 'master' into compare_callers_nb
cansavvy Oct 29, 2019
609b319
attempt to fix CircleCI's lack of grid package
Oct 29, 2019
80c920c
Add grid to Dockerfile explicitly
Oct 29, 2019
30584b0
Made some things a tad less janky
Oct 29, 2019
7153398
Push refreshed notebook. Get rid of grid install from Dockerfile
Oct 29, 2019
38c254e
Try older version of UpSetR for shiggles
Oct 29, 2019
45e9834
Get rid of `library(grid)` so we know if the version of UpSetR has it…
Oct 29, 2019
29381a0
Test CRAN installation fo UpSetR
Oct 29, 2019
d6a8af1
beginning to reformulate script organization for consensus file creation
Oct 29, 2019
9a11173
Push reorganized scripts and notebooks
Oct 30, 2019
1bccb5c
renaming and reorganzing and styling
Oct 30, 2019
6961855
suppressWarnings for coercing variables. We expect that
Oct 30, 2019
912f182
get rid of some development remnants
Oct 30, 2019
3bbafa0
Streamline the PR
Oct 30, 2019
574e1e4
Get rid of dummy file
Oct 30, 2019
2c3ea05
Streamline even more
Oct 30, 2019
d71d15b
Get rid of superfluous pound signs
Oct 30, 2019
12bb2ca
Actually add merge callers script
Oct 30, 2019
35f7b70
get rid of outdated source(functions) command
Oct 30, 2019
c192773
Make --overwrite actually functional
Oct 30, 2019
f8d7a66
re-run linter after those adds
Oct 30, 2019
e8e9460
Merge branch 'master' into merge_callers_script
cansavvy Oct 30, 2019
3cb1245
Get rid of outdated `-w`
Oct 30, 2019
f13bea0
Merge remote-tracking branch 'origin/merge_callers_script' into merge…
Oct 30, 2019
8d322a3
Fix example
Oct 30, 2019
8d55eb9
Found an outdated comment
Oct 30, 2019
a323dfa
Address the outdated comments spotted by @jaclyn-taroni
Oct 30, 2019
b61d162
Merge remote-tracking 'upstream/master' into merge_callers_script
jaclyn-taroni Oct 30, 2019
acab221
Make sure #190 changes are included
jaclyn-taroni Oct 30, 2019
11 changes: 8 additions & 3 deletions analyses/snv-callers/run_caller_analysis.sh
@@ -49,10 +49,9 @@ do
--no_region \
--overwrite
done

######################## Plot the data and create reports ######################
for dataset in ${datasets[@]}
do
echo "Processing dataset: ${dataset}"
Rscript analyses/snv-callers/scripts/02-run_eval.R \
--label ${dataset} \
@@ -63,4 +62,10 @@ for dataset in ${datasets[@]}
--cosmic $cosmic \
--strategy wgs,wxs,both \
--no_region
done
##################### Merge callers' files into total files ####################
Rscript analyses/snv-callers/scripts/03-merge_callers.R \
--vaf analyses/snv-callers/results \
--output analyses/snv-callers/results/consensus \
--file_format $format \
--overwrite
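
A note on the loop style used in this script: `${datasets[@]}` is expanded unquoted, which works as long as no label contains whitespace. A minimal sketch of the quoted form follows. The labels below are hypothetical placeholders, since the real `datasets` array is defined outside the hunks shown here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical dataset labels for illustration only; the real list is
# defined earlier in run_caller_analysis.sh and is not part of this hunk.
datasets=("caller_a" "caller_b" "caller c")

# Quoting "${datasets[@]}" keeps each array element as a single word,
# even when a label contains whitespace.
list_datasets() {
  local dataset
  for dataset in "${datasets[@]}"
  do
    echo "Processing dataset: ${dataset}"
  done
}

list_datasets
```

The same reasoning would apply to `$cosmic` and `$format`: quoting them as `"$cosmic"` and `"$format"` avoids word splitting if a path ever contains a space.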
304 changes: 304 additions & 0 deletions analyses/snv-callers/scripts/03-merge_callers.R
@@ -0,0 +1,304 @@
# Merge the caller VAF and TMB files
#
# 2019
#
# C. Savonen for ALSF - CCDL
#
# Purpose: Merge callers' TMB and VAF files into total files with a column `caller`
# to designate their origin.

# Files Output (always written as RDS, regardless of --file_format):
# "all_callers_vaf.rds" - contains all the VAF file information for all callers.
# "all_callers_tmb.rds" - contains all the TMB file information for all callers.
# "mutation_id_list.rds" - a full list of the mutations that can be
# used for an UpSetR graph
# "callers_per_mutation.rds" - contains a breakdown for each mutation of what callers
# called it. Will be used to identify the consensus mutations.

# Option descriptions
# --vaf : Parent folder containing the vaf and tmb files for each caller.
# <caller_name>_vaf.<file_format>
# <caller_name>_tmb.<file_format>
# --file_format: What type of file format were the vaf and tmb files saved as? Options are
# "rds" or "tsv". Default is "rds".
# --output : Where you would like the output from this script to be stored.
# --overwrite : If TRUE, will overwrite any reports of the same name. Default is
# FALSE
#
#
# Command line example:
#
# Rscript 03-merge_callers.R \
# -v results \
# -o strelka2 \
# -s wxs \
# -w
Member: What does this -w correspond to?

Collaborator Author: I used to give --overwrite a short flag, but then I deleted it and forgot to remove it from the example.

#
# Establish base dir
root_dir <- rprojroot::find_root(rprojroot::has_dir(".git"))

# Magrittr pipe
`%>%` <- dplyr::`%>%`

# Load library:
library(optparse)

#--------------------------------Set up options--------------------------------#
# Set up optparse options
option_list <- list(
make_option(
opt_str = c("-v", "--vaf"), type = "character",
default = NULL, help = "Path to folder with the output files
from 01-calculate_vaf_tmb. Should include the VAF, TMB, and
region TSV files",
metavar = "character"
),
make_option(
opt_str = c("-f", "--file_format"), type = "character", default = "rds",
help = "What type of file format were the vaf and tmb files saved as?
Options are 'rds' or 'tsv'. Default is 'rds'.",
metavar = "character"
),
make_option(
opt_str = c("-o", "--output"), type = "character",
default = NULL, help = "Path to folder where you would like the
output from this script to be stored.",
metavar = "character"
),
make_option(
opt_str = c("--overwrite"), action = "store_true",
default = FALSE, help = "If TRUE, will overwrite any reports of
the same name. Default is FALSE",
metavar = "character"
)
)

# Parse options
opt <- parse_args(OptionParser(option_list = option_list))

########################### Check options specified ############################
# Carry the file suffix along, converted to lowercase.
file_suffix <- tolower(opt$file_format)

# Check that the file format is supported
if (!(file_suffix %in% c("rds", "tsv"))) {
warning("Option used for file format (-f) is not supported. Only 'tsv' or 'rds'
files are supported. Defaulting to rds.")
opt$file_format <- "rds"
file_suffix <- "rds"
}

# Normalize this file path
opt$vaf <- file.path(root_dir, opt$vaf)

# Check that the input directory exists
if (!dir.exists(opt$vaf)) {
stop(paste("Error:", opt$vaf, "does not exist"))
}

# Exclude the non-caller directories
caller_dirs <- grep("vaf_cutoff|consensus",
dir(opt$vaf, full.names = TRUE),
invert = TRUE,
value = TRUE
)

# Print this out to check
message("Will merge all VAF and TMB files in these folders: \n", paste0(caller_dirs, "\n"))

# Get a list of vaf files
vaf_files <- sapply(caller_dirs,
list.files,
pattern = paste0("_vaf\\.", file_suffix),
recursive = TRUE, full.names = TRUE
)

# Print this out to check
message("Merging these VAF files: \n", paste0(vaf_files, "\n"))

# Get a list of tmb files
tmb_files <- sapply(caller_dirs,
list.files,
pattern = paste0("_tmb\\.", file_suffix),
recursive = TRUE, full.names = TRUE
)

# Print this out to check
message("Merging these TMB files: \n", paste0(tmb_files, "\n"))

################################### Set Up #####################################
# Normalize the output directory path
opt$output <- file.path(root_dir, opt$output)

# Create the output folder if it does not exist
if (!dir.exists(opt$output)) {
dir.create(opt$output, recursive = TRUE)
}

# Declare output file paths
all_vaf_file <- file.path(opt$output, "all_callers_vaf.rds")
all_tmb_file <- file.path(opt$output, "all_callers_tmb.rds")
mut_id_file <- file.path(opt$output, "mutation_id_list.rds")
call_per_mut_file <- file.path(opt$output, "callers_per_mutation.rds")

##################### Check for files if overwrite is FALSE ####################
# If overwrite is set to FALSE, check if these exist before continuing
if (!opt$overwrite) {
# Make a list of the output files
output_files <- c(all_vaf_file, all_tmb_file, mut_id_file, call_per_mut_file)

# Find out which of these exist
existing_files <- file.exists(output_files)

# If all files exist, stop. The message is passed to stop() directly because
# cat() returns NULL and would leave the error message empty.
if (all(existing_files)) {
stop(
"--overwrite is not being used and all output files already exist ",
"in the designated --output directory."
)
}
# If only some files exist, print a warning:
if (any(existing_files)) {
warning(
"Some output files already exist and will not be overwritten unless you use --overwrite: \n",
paste0(output_files[existing_files], collapse = "\n")
)
}
}

########################### Make Master VAF file ###############################
# If the file exists and the overwrite option is not being used, do not write the
# merged VAF file.
if (file.exists(all_vaf_file) && !opt$overwrite) {
# Warn and skip if this file exists and overwrite is set to FALSE
warning(
"The merged VAF file already exists: \n",
all_vaf_file, "\n",
"Use --overwrite if you want to overwrite it."
)
} else {
# Get the caller names
caller_names <- stringr::word(vaf_files, sep = "/", -2)

# Read in vaf files for all callers
if (file_suffix == "tsv") {
vaf_list <- lapply(vaf_files, readr::read_tsv)
} else {
vaf_list <- lapply(vaf_files, readr::read_rds)
}

# Standardize column types so the callers' data frames can be combined
vaf_list <- lapply(vaf_list, function(df) {
# Make it so it is more easily combined with the other files
df %>%
# Attempt to make columns numeric where that doesn't kick back an "NA"
dplyr::mutate_at(dplyr::vars(which(!is.na(as.numeric(t(df[1, ]))))), as.numeric) %>%
# Aliquot id sometimes contains letters and sometimes numbers across the callers
dplyr::mutate(
aliquot_id = as.character(aliquot_id),
variant_qual = as.character(variant_qual)
) %>%
# Turn these columns into characters because otherwise they cause trouble.
dplyr::mutate_at(dplyr::vars(dplyr::contains("AF", ignore.case = FALSE)), as.character) %>%
# Get rid of the few if any duplicate entries.
dplyr::distinct(mutation_id, .keep_all = TRUE)
})

# Carry over the callers' names
names(vaf_list) <- caller_names

# Print progress message
message("Saving master VAF file to: \n", all_vaf_file)

# Combine and save VAF file
vaf_df <- suppressWarnings(dplyr::bind_rows(vaf_list, .id = "caller")) %>%
dplyr::mutate(caller = factor(caller)) %>%
# Write to RDS file
readr::write_rds(all_vaf_file)
}
########################### Make Master TMB file ###############################
# If the file exists and the overwrite option is not being used, do not write the
# merged TMB file.
if (file.exists(all_tmb_file) && !opt$overwrite) {
# Warn and skip if this file exists and overwrite is set to FALSE
warning(
"The merged TMB file already exists: \n",
all_tmb_file, "\n",
"Use --overwrite if you want to overwrite it."
)
} else {
if (file_suffix == "tsv") {
tmb_list <- lapply(tmb_files, readr::read_tsv)
} else {
tmb_list <- lapply(tmb_files, readr::read_rds)
}

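# Carry over the callers' names, derived from the TMB file paths so this
# section does not depend on the VAF section having run
names(tmb_list) <- stringr::word(tmb_files, sep = "/", -2)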

# Print progress message
message("Saving master TMB file to: \n", all_tmb_file)

# Combine and save TMB file
tmb_df <- dplyr::bind_rows(tmb_list, .id = "caller") %>%
dplyr::mutate(caller = factor(caller)) %>%
readr::write_rds(all_tmb_file)
}
############################# Make mutation id list ############################
# If the file exists and the overwrite option is not being used, do not write mutation id file.
if (file.exists(mut_id_file) && !opt$overwrite) {
# Warn and skip if this file exists and overwrite is set to FALSE
warning(
"The mutation id list file already exists: \n",
mut_id_file, "\n",
"Use --overwrite if you want to overwrite it."
)
} else {
mutation_id_list <- lapply(vaf_list, function(caller) caller$mutation_id)

# Print progress message
message("Saving: \n", mut_id_file)

readr::write_rds(mutation_id_list, mut_id_file)
}
############################# Callers per mutation df ##########################
# If the file exists and the overwrite option is not being used, do not write the
# callers per mutation file.
if (file.exists(call_per_mut_file) && !opt$overwrite) {
# Warn and skip if this file exists and overwrite is set to FALSE
warning(
"The callers per mutation file already exists: \n",
call_per_mut_file, "\n",
"Use --overwrite if you want to overwrite it."
)
} else {
# Make a string that says what callers call each mutation
callers_per_mutation <- tapply(vaf_df$caller,
vaf_df$mutation_id,
paste0,
collapse = "-"
) %>%
# Make into a data.frame
as.data.frame() %>%
tibble::rownames_to_column("mutation_id")

# Obtain the median VAF for each mutation
vaf_med <- tapply(
vaf_df$vaf,
vaf_df$mutation_id,
median
) %>%
# Make into a data.frame
as.data.frame() %>%
tibble::rownames_to_column("mutation_id")

# Print progress message
message("Saving: \n", call_per_mut_file)

# Join the median VAF and the callers that call that mutation into one data.frame
callers_per_mutation <- callers_per_mutation %>%
dplyr::inner_join(vaf_med, by = "mutation_id") %>%
# Make column names more sensible
dplyr::rename(caller_combo = "..x", median_vaf = "..y") %>%
readr::write_rds(call_per_mut_file)
}
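
The overwrite handling in this script distinguishes three cases: all four output files exist (stop), some exist (warn and skip those), or none exist. As a rough illustration of that decision only, here is a shell sketch that classifies an output directory the same way. The function name `existing_outputs` is hypothetical and not part of this PR; the file names are taken from the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Output file names as declared in 03-merge_callers.R
outputs=(all_callers_vaf.rds all_callers_tmb.rds mutation_id_list.rds callers_per_mutation.rds)

# Hypothetical helper: print "all", "some", or "none" depending on how many
# of the expected outputs already exist in the directory given as $1.
existing_outputs() {
  local dir="$1" count=0 f
  for f in "${outputs[@]}"
  do
    if [[ -e "${dir}/${f}" ]]; then
      count=$((count + 1))
    fi
  done
  if (( count == ${#outputs[@]} )); then
    echo "all"
  elif (( count > 0 )); then
    echo "some"
  else
    echo "none"
  fi
}

# Example: a fresh temporary directory contains none of the outputs
demo_dir="$(mktemp -d)"
existing_outputs "$demo_dir"
rm -rf "$demo_dir"
```

A wrapper such as run_caller_analysis.sh could, under these assumptions, use the "all" case to skip the Rscript call entirely when `--overwrite` is not wanted.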