JAVA framework for accurate SNV assessment
JACUSA1 has been deprecated and superseded by JACUSA2. Likewise, JACUSAhelper has been deprecated and replaced by JACUSA2helper. Please switch to the new versions of the software.
Find source code and tools in the following sub-directories of the repository:
- src/ The main Java source code for JACUSA
- manual/manual.pdf The manual for JACUSA
- JacusaHelper R package to process JACUSA output file(s)
- tools/AddVariants Java tool to implant variants into BAM file
JACUSA has been developed and tested with Java v1.8.
IMPORTANT! Stranded paired-end data are handled properly with JACUSA v1.2.0 and higher. DO NOT USE JACUSA v1.0.1 on stranded paired-end data!
Get the current Jacusa JAR:
https://github.com/dieterich-lab/JACUSA/releases/download/v1.3.5/JACUSA_v1.3.5.jar
v1.3.0
- Fixed an issue with call-1 and stranded reads. JACUSA now requires BAM files to contain MD field.
- Changed calculation of up/downstream matches between filtered features
- Changed Homopolymer calculation -> simple homopolymer within alignment blocks (insertions and introns currently ignored)
v1.2.3
- Fixed pileup filter to maintain orientation for all library type combinations
- Fixed typo in library type: FR_FIRSTSTRAND -> RF_FIRSTSTRAND
v1.2.0
- Added support for stranded paired end reads - parameter -P changed
- Added support for single sample mode
- Added -R | --SHOW-REF option
- Minor fixes / typos
v1.0.1
- Minor fixes / typos.
DO NOT USE JACUSA v1.0.1 on stranded paired-end data!
https://github.com/dieterich-lab/JACUSA/raw/master/build/JACUSA_v1.0.1.jar
Since v1.2 the format of -P has changed! The format is inspired by the library type parameter of TopHat (http://ccb.jhu.edu/software/tophat/manual.shtml). With the command line parameter -P,--build-pileup the user can choose from combinations of:
- FR-FIRSTSTRAND STRANDED library - first strand sequenced,
- FR-SECONDSTRAND STRANDED library - second strand sequenced, and
- UNSTRANDED UNSTRANDED library.
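A mistyped library type is easy to miss in a long command line. The following is a minimal sketch of a pre-flight check for a -P value, assuming the comma-separated form used in the call-2 example of this README (e.g. UNSTRANDED,FR-FIRSTSTRAND). The function name check_pileup_option is hypothetical and not part of JACUSA.

```sh
# Check a -P value against the three library types listed above.
# check_pileup_option is a hypothetical helper, not shipped with JACUSA.
check_pileup_option() {
  # Reject an empty value outright.
  [ -n "$1" ] || return 1
  # Split on commas inside a subshell so the caller's IFS stays untouched.
  (
    IFS=,
    set -f   # disable globbing while the value is split
    for t in $1; do
      case "$t" in
        FR-FIRSTSTRAND|FR-SECONDSTRAND|UNSTRANDED) ;;
        *) printf 'unknown library type: %s\n' "$t" >&2; exit 1 ;;
      esac
    done
  )
}

if check_pileup_option "UNSTRANDED,FR-FIRSTSTRAND"; then
  echo "ok"
fi
```

The check only compares names; it does not verify that the number of types matches the number of samples.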
Available methods for JACUSA ($ java -jar jacusa.jar [ENTER]):
- call-1 Call variants - one sample
- call-2 Call variants - two samples
- pileup SAMtools-like mpileup for two samples
General command line structure for variant calling call-1:
jacusa.jar call-1 [OPTIONS] BAM1_1[,BAM1_2,BAM1_3,...]
Get available options:
java -jar jacusa.jar call-1
General command line structure for variant calling call-2:
jacusa.jar call-2 [OPTIONS] BAM1_1[,BAM1_2,BAM1_3,...] BAM2_1[,BAM2_2,BAM2_3,...]
Get available options:
java -jar jacusa.jar call-2
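In the call-2 synopsis, replicates of one sample are joined with commas and the two samples are separated by a space. A small sketch of assembling such a command line follows; the BAM file names are placeholders, and join_by_comma is a hypothetical helper. The command is printed rather than executed so it can be inspected first.

```sh
# Join all arguments with commas, as the call-2 synopsis expects for replicates.
# The function body is a subshell, so the IFS change does not leak out.
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)

# Placeholder file names: two replicates for sample 1, three for sample 2.
sample1=$(join_by_comma ctrl_rep1.bam ctrl_rep2.bam)
sample2=$(join_by_comma treat_rep1.bam treat_rep2.bam treat_rep3.bam)

# Assemble and print the command line (options would go before the BAM lists).
cmd="java -jar jacusa.jar call-2 $sample1 $sample2"
printf '%s\n' "$cmd"
```

This simple form assumes file paths without spaces.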
Download and extract sample data
# go to https://data.dieterichlab.org/s/hg19_chr1_gDNA_VS_cDNA
# download hg19_chr1_gDNA_VS_cDNA.tar.gz
# and unpack with
tar xzvpf hg19_chr1_gDNA_VS_cDNA.tar.gz
Call RNA-DNA differences (RDDs) by comparing gDNA and cDNA in sample data and save results in rdds.out.
$ java -jar jacusa.jar call-2 -P UNSTRANDED,FR-FIRSTSTRAND -a H,M,B,Y -f 1024 -T 2.3 -p 2 -r rdds.out gDNA.bam cDNA1.bam,cDNA2.bam
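The v1.3.0 changelog entry notes that JACUSA requires BAM files to contain the MD field, so it may be worth checking inputs before a long run. The sketch below counts mapped SAM records on standard input that lack an MD:Z: tag. The function name count_missing_md is hypothetical, and the two records piped in at the end are made-up demo data.

```sh
# Count mapped SAM records on stdin that carry no MD:Z: optional field.
# count_missing_md is a hypothetical helper, not part of JACUSA or SAMtools.
count_missing_md() {
  awk -F '\t' '
    /^@/ { next }                       # skip SAM header lines
    int($2 / 4) % 2 == 1 { next }       # skip unmapped reads (FLAG bit 0x4)
    {
      found = 0
      for (i = 12; i <= NF; i++)        # optional fields start at column 12
        if ($i ~ /^MD:Z:/) found = 1
      if (!found) missing++
    }
    END { print missing + 0 }
  '
}

# Demo with two made-up records: r1 has an MD tag, r2 does not.
printf 'r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tMD:Z:4\nr2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\n' | count_missing_md
# prints 1
```

On real data the records could be supplied with something like samtools view cDNA1.bam | head -n 10000 | count_missing_md. If tags are missing, samtools calmd is one way to add them.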
Read, process, and write JACUSA output files
Download JacusaHelper:
$ wget https://github.com/dieterich-lab/JACUSA/raw/master/JacusaHelper/build/JacusaHelper_0.43.tar.gz
Install JacusaHelper in R:
install.packages("JacusaHelper_0.43.tar.gz", repos = NULL, type = "source")
Load JacusaHelper package in R:
library("JacusaHelper")
Read JACUSA output, remove sites where the variant base is not present in all replicates of at least one sample, and finally add editing frequency info:
# Read JACUSA output, keep sites with test-statistic >= 1.56, and
# require at least 10 reads in sample 1 (cov1) and at least 5 reads per replicate in sample 2 (covs2)
data <- Read("Jacusa_RDD.out", stat = 1.56, fields = c("cov1", "covs2"), cov = c(10, 5))
# This ensures that the variant base is present in all replicates of at least one sample
data <- FilterResult(data)
# This is only applicable for RDD calls and it will calculate their editing frequency.
# It is expected that gDNA is stored as sample 1!
data <- AddEditingFreqInfo(data)
Plot base change conversion:
# Among other things, AddEditingFreqInfo populates the baseChange field in data
tbl <- table(data$baseChange)
barplot(tbl)
Check documentation in R for more details
?JacusaHelper
Add variants to a BAM file
Get the current AddVariants JAR:
$ wget https://github.com/dieterich-lab/JACUSA/raw/master/tools/AddVariants/build/AddVariants_v0.3.jar
Implant variants defined in <variants.bed> into <input.bam> and write results to <output.bam>:
java -jar AddVariants.jar <input.bam> <variants.bed> | samtools view -Sb - > <output.bam>
chr | start | end
see LICENSE file