Nextflow Pipelines

Bulk RNA-seq

We recommend the popular nf-core/rnaseq Nextflow pipeline for RNA sequencing analysis. It uses STAR, HISAT2 and Salmon, and produces gene counts and quality control reports.

Useful links:

Nextflow RNA-seq main: https://nf-co.re/rnaseq
Nextflow RNA-seq usage: https://nf-co.re/rnaseq/docs/usage#running-the-pipeline
Nextflow configuration: https://www.nextflow.io/docs/latest/config.html
Nextflow pipelines: https://nf-co.re/pipelines
Solution to SGE using the wrong shell: https://github.com/nextflow-io/nextflow/issues/21

Installation and test run

Log in to Wynton

ssh user@log2.wynton.ucsf.edu

ssh into a dev node

ssh dev2

From your home directory, download nextflow

curl -fsSL get.nextflow.io | bash

Make a bin directory (if you haven't already) and move nextflow there

mkdir -p ~/bin
mv nextflow ~/bin/
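
The shell only finds nextflow by name if ~/bin is on your PATH. The following is a minimal sketch of a check, assuming a POSIX-compatible shell such as bash; path_contains is a hypothetical helper written for this example, not part of Nextflow or Wynton.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2 (defaults to the current PATH).
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "Add this line to ~/.bashrc: export PATH=\"\$HOME/bin:\$PATH\""
fi
```

If the check suggests the export line, add it to ~/.bashrc and open a new shell before running nextflow.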

Create a nextflow configuration file to specify SGE settings

mkdir -p ~/.nextflow
printf 'process.executor = "sge"\nprocess.penv = "smp"\nprocess.clusterOptions = "-S /bin/bash"\n' > ~/.nextflow/config
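
The same three settings can also be written in Nextflow's block syntax, which may be easier to read and extend later. This is a sketch rather than a required step. CONFIG_DIR is an illustrative variable introduced here so the file can be written somewhere else for inspection first; by default it points at ~/.nextflow, and the command overwrites any existing config file there.

```shell
# CONFIG_DIR is an illustrative variable; it defaults to ~/.nextflow.
CONFIG_DIR="${CONFIG_DIR:-$HOME/.nextflow}"
mkdir -p "$CONFIG_DIR"

# The quoted heredoc delimiter ('EOF') stops the shell from expanding
# anything inside the block, so the file is written exactly as shown.
cat > "$CONFIG_DIR/config" <<'EOF'
process {
    executor       = 'sge'
    penv           = 'smp'
    clusterOptions = '-S /bin/bash'
}
EOF

# Print the file to confirm what was written.
cat "$CONFIG_DIR/config"
```

The clusterOptions line passes -S /bin/bash to SGE so that jobs run under bash, which is the fix described in the "SGE using the wrong shell" link above.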

Run the nextflow test pipeline with the singularity profile. The console displays progress in real time. During the first run, a warning appears about the automatic creation of a singularity cache directory.

nextflow run nf-core/rnaseq -profile test,singularity

The output will be in the results directory. Pipeline reports are in results/pipeline_info/. Note: if you get an error, try running it a second time.

ls results/
ls results/pipeline_info

Custom runs

Now you can set up and run the pipeline on your own data with steps like the following:

Copy your fastq files over to Wynton (see How to move data)
Specify max_memory, genome, reads and optional skip* arguments in the command (see the docs on reads, genome and the many other arguments, which should be considered carefully)

nextflow run nf-core/rnaseq --max_memory '8.GB' --skipBiotypeQC --skipFastQC --skipTrimming --genome GRCh38 --reads '*_R{1,2}.fastq.gz' -profile singularity
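
The --reads glob in the command above expects every sample to have both an _R1 and an _R2 file. Before launching a long run, a quick pre-flight check along these lines may catch an incomplete transfer. It is a sketch that assumes the file naming used in the glob; check_pairs is a hypothetical helper, not part of the pipeline.

```shell
# Hypothetical helper: report every *_R1.fastq.gz in directory $1
# (default: current directory) that has no matching *_R2.fastq.gz.
# Returns non-zero if any mate is missing.
check_pairs() {
    dir="${1:-.}"
    missing=0
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue   # the glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_pairs . && echo "all R1 files have an R2 mate"
```

Note that the check only looks for R1 files without an R2 mate, and it also succeeds when the directory contains no R1 files at all, so confirm the file count separately with ls.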

Pro-tips:

Review the execution_report.html to determine the necessary max_memory value for your analysis.
You may want to use screen or tmux to manage longer runs.

Other analyses

See other nextflow pipelines: https://nf-co.re/pipelines
