Fixed local build and upgraded to gradle 7. (#50)
This included upgrading the gradle build to gradle 7, and adding the
gradle wrappers.

Note: java.sh and picard had to be updated to accommodate the fact that
we have a monolithic picard jar in the Docker image, but a lite picard
jar with separate htsjdk for local use.
mcrusch authored Nov 20, 2023
1 parent 560b8ec commit c3f1520
Showing 9 changed files with 398 additions and 41 deletions.
4 changes: 3 additions & 1 deletion Dockerfile
@@ -69,11 +69,13 @@ ENV PATH /opt/gradle/bin:${PATH}
COPY bin /tmp/xenocp/bin
COPY src /tmp/xenocp/src
COPY dependencies /tmp/xenocp/dependencies
COPY gradle /tmp/xenocp/gradle
COPY gradlew /tmp/xenocp/gradlew
COPY build.gradle /tmp/xenocp/build.gradle
COPY settings.gradle /tmp/xenocp/settings.gradle

RUN cd /tmp/xenocp \
&& gradle installDist \
&& ./gradlew installDist \
&& cp -r build/install/xenocp /opt

FROM ubuntu:20.04
82 changes: 49 additions & 33 deletions README.md
@@ -5,7 +5,7 @@ XenoCP can be easily incorporated into any workflow, as it takes a BAM file
as input and efficiently cleans up the mouse contamination. The output is a clean
human BAM file that could be used for downstream genomic analysis.

## Getting started
## Quick Start

XenoCP can be run in the cloud on DNAnexus at
https://platform.dnanexus.com/app/stjude_xenocp
@@ -39,7 +39,38 @@ XenoCP workflow:
<!--![Alt text](images/xenocp_workflow2.png) -->
<img src="images/xenocp_workflow2.png" width="500">

## Prerequisites
## Reference Files

XenoCP performs mapping against the host genome, so it requires indexes for the
host reference genome and mapper being used.

A common use case is cleansing DNA reads with a mouse host. For this use case,
you can download a BWA index for MGSCv37 from
http://ftp.stjude.org/pub/software/xenocp/reference/MGSCv37

To build your own reference files, first download the FASTA file for your genome
assembly. Then, create the index for your mapper:

### BWA for DNA Reads

```
$ bwa index -p $FASTA $FASTA
```

### STAR for RNA Reads

Download an annotation file such as gencode, and then run:

```
$ STAR --runMode genomeGenerate --genomeDir STAR --genomeFastaFiles $FASTA --sjdbGTFfile $ANNOTATION --sjdbOverhang 125
```

## Local Usage without Docker

### Prerequisites

First, install the following prerequisites. Note that if you are only using one
of the two mappers, bwa and STAR, you can omit the other.

* [bwa] =0.7.13
* [STAR] =2.7.1a
@@ -73,28 +104,25 @@ disabled.
[zlib]: https://www.zlib.net/
[sambamba]: http://lomereiter.github.io/sambamba/

### Obtain and Build XenoCP


## Local usage


### Obtain XenoCP

Clone XenoCP from GitHub:
Clone XenoCP from GitHub:
```
git clone https://github.com/stjude/XenoCP.git
```

### Build XenoCP

Once the prerequisites are satisfied, build XenoCP using Gradle.
Build XenoCP using Gradle:

```
$ gradle installDist
```

Add the artifacts under `build/install/xenocp/lib` to your Java `CLASSPATH`.
Add the artifacts under `build/install/xenocp/bin` to your `PATH`.
Add the artifacts under `build/install/xenocp` to your `PATH` and your Java `CLASSPATH`:

```
export PATH=$PATH:`pwd`/build/install/xenocp/bin
export CLASSPATH=$CLASSPATH:`pwd`/build/install/xenocp/lib/*
```

### Inputs

@@ -134,25 +162,8 @@ output_prefix: xenocp-
output_extension: bam
```

### Create Reference Files

Download the FASTA file for your genome assembly and run the following commands to create other files:
#### BWA reference files
```
$ bwa index -p $FASTA $FASTA
```
#### STAR reference files
In addition the genomic FASTA, STAR reference should use an annotation file (e.g. gencode).
```
$ STAR --runMode genomeGenerate --genomeDir STAR --genomeFastaFiles $FASTA --sjdbGTFfile $ANNOTATION --sjdbOverhang 125
```

[CWL inputs]: https://www.commonwl.org/user_guide/02-1st-example/index.html

### Download MGSCv37 reference files

Reference files are provided for version MGSCv37 of mouse and are available from http://ftp.stjude.org/pub/software/xenocp/reference/MGSCv37

### Run

XenoCP uses [CWL] to describe its workflow.
@@ -162,12 +173,12 @@ Then run the following.

```
$ mkdir results
$ cwltool --outdir results cwl/xenocp.cwl sample_data/input_data/inputs_local.yml
$ cwltool --preserve-environment CLASSPATH --no-container --outdir results cwl/xenocp.cwl sample_data/input_data/inputs_local.yml
```

[CWL]: https://www.commonwl.org/

## Docker
## Local Usage with Docker

XenoCP provides a [Dockerfile] that builds an image with all the included
dependencies. To use this image, install [Docker] for your platform.
@@ -216,6 +227,11 @@ $ docker run \
/data/inputs.yml
```

Note: when running using Singularity on an HPC, problems can arise if the
default temporary file location, /tmp, is small. To solve this, include
`-W <dir>` when executing via Singularity to redirect temp files to a
larger directory `<dir>`.

[Dockerfile]: ./Dockerfile

## Evaluate test data results
5 changes: 5 additions & 0 deletions bin/java.sh
@@ -1,5 +1,10 @@
#!/usr/bin/env bash

# If the classpath is already set, then delegate directly to java
if [ "$CLASSPATH" != "" ]; then exec java "$@"; fi

# Otherwise, build an appropriate classpath
# This section assumes you are running inside the container
for arg in "$@"; do
case $arg in
org.stjude.compbio.*)
6 changes: 6 additions & 0 deletions bin/picard
@@ -1,2 +1,8 @@
#!/usr/bin/env bash

# If the classpath is already set, then invoke PicardCommandLine, allowing
# java to use the classpath.
if [ "$CLASSPATH" != "" ]; then exec java picard.cmdline.PicardCommandLine "$@"; fi

# Otherwise, use the full picard jar; this assumes you're running in the container.
java -jar /opt/picard/lib/picard*.jar "$@"
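
Review note: the `CLASSPATH` check added to `bin/java.sh` and `bin/picard` amounts to a two-way dispatch between the local build (lite picard jar plus separate htsjdk, found via `CLASSPATH`) and the container (monolithic picard jar). The sketch below isolates that decision so it can be inspected without invoking Java. The function name `launch_mode` and the labels it prints are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this commit): reports which branch the
# wrapper scripts would take in the current environment.
launch_mode() {
  if [ "${CLASSPATH:-}" != "" ]; then
    # Local build: classes are resolved through the user's CLASSPATH.
    echo "classpath"
  else
    # Container: fall back to the monolithic jar under /opt/picard/lib.
    echo "container-jar"
  fi
}

launch_mode
```

This also shows why the README change passes `--preserve-environment CLASSPATH` to `cwltool`: if the variable does not reach the tool's environment, the scripts take the container branch and look for files that may not exist locally.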
14 changes: 7 additions & 7 deletions build.gradle
@@ -20,24 +20,24 @@ distributions {
// Puts the compiled xenocp jar into lib
from jar
// Puts all of the compile dependencies into lib
from configurations.compile
from configurations.compileClasspath
}
}
}
}

repositories {
jcenter()
mavenCentral()
}

dependencies {
compile 'com.github.broadinstitute:picard:2.6.0'
compile 'commons-cli:commons-cli:1.2'
compile 'com.github.samtools:htsjdk:2.19.0'
implementation 'com.github.broadinstitute:picard:2.19.2'
implementation 'commons-cli:commons-cli:1.2'
implementation 'com.github.samtools:htsjdk:2.19.0'

compile files('dependencies/lib/java/xenocp-dependencies.jar')
implementation files('dependencies/lib/java/xenocp-dependencies.jar')

testCompile 'junit:junit:4.10'
testImplementation 'junit:junit:4.10'
}

group = 'org.stjude.compbio.xenocp'
Binary file added gradle/wrapper/gradle-wrapper.jar
Binary file not shown.
5 changes: 5 additions & 0 deletions gradle/wrapper/gradle-wrapper.properties
@@ -0,0 +1,5 @@
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-7.2-bin.zip
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists