This repository has been archived by the owner on May 2, 2024. It is now read-only.
Nextflow Pipelines
Alexander Pico edited this page Aug 4, 2020 · 2 revisions
We recommend the popular nf-core/rnaseq Nextflow pipeline for RNA sequencing analysis. It uses STAR, HISAT2 and Salmon, and provides gene counts and quality control.
Useful links:
- Nextflow RNA-seq main: https://nf-co.re/rnaseq
- Nextflow RNA-seq usage: https://nf-co.re/rnaseq/docs/usage#running-the-pipeline
- Nextflow configuration: https://www.nextflow.io/docs/latest/config.html
- Nextflow pipelines: https://nf-co.re/pipelines
- Solution to SGE using the wrong shell: https://github.com/nextflow-io/nextflow/issues/21
- Log in to Wynton
ssh user@log2.wynton.ucsf.edu
- ssh into a dev node
ssh dev2
- From your home directory, download nextflow
curl -fsSL get.nextflow.io | bash
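- Make a bin directory in your home directory (if you haven't already) and move nextflow there. If ~/bin is not already on your PATH, add it so that the nextflow command can be found.
mkdir -p ~/bin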
mv nextflow ~/bin/
- Create a nextflow configuration file to specify SGE settings. The -S /bin/bash cluster option works around SGE using the wrong shell (see the link above).
printf 'process.executor = "sge"\nprocess.penv = "smp"\nprocess.clusterOptions = "-S /bin/bash"' > ~/.nextflow/config
- Run the nextflow test pipeline, specifying the singularity profile. The console will display the progress in real time. During the first run, a warning message will appear regarding the automatic creation of a singularity cache directory.
nextflow run nf-core/rnaseq -profile test,singularity
- The output will be in the results directory. Pipeline reports are in results/pipeline_info/. Note: if you get an error, try running it a second time.
ls results/
ls results/pipeline_info
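The configuration step above packs three settings into a single printf call, which can be hard to read or extend. As a minimal sketch, the same three lines could be written with a here-document instead. The function name write_sge_config and its optional directory argument are illustrative assumptions, not part of Nextflow. Like the printf command, it overwrites any existing config file.

```shell
# Write the three SGE settings from the step above to a Nextflow config file.
# Usage: write_sge_config [config_dir]   (config_dir defaults to ~/.nextflow)
# write_sge_config is a hypothetical helper name, not a Nextflow command.
write_sge_config() {
    config_dir="${1:-$HOME/.nextflow}"
    mkdir -p "$config_dir"
    # Quoted 'EOF' keeps the text literal (no variable expansion).
    # This overwrites any existing file at $config_dir/config.
    cat > "$config_dir/config" <<'EOF'
process.executor = "sge"
process.penv = "smp"
process.clusterOptions = "-S /bin/bash"
EOF
}
```

Calling write_sge_config with no argument writes to ~/.nextflow/config. Passing a directory, for example a scratch folder, lets you inspect the result first.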
Now you can set up and run the pipeline on your own data with steps like the following:
- Copy your fastq files over to wynton (see How to move data)
- Specify max_memory, genome, reads and optional skip* arguments in the command (see the docs on reads, genome and the many other arguments that should be considered carefully)
nextflow run nf-core/rnaseq --max_memory '8.GB' --skipBiotypeQC --skipFastQC --skipTrimming --genome GRCh38 --reads '*_R{1,2}.fastq.gz' -profile singularity
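The --reads pattern above expects paired files named *_R1.fastq.gz and *_R2.fastq.gz. Before launching a long run, it may be worth checking that every R1 file has its mate. The following is a minimal sketch assuming that naming scheme and a bash or sh shell. check_read_pairs is a hypothetical helper, not part of Nextflow or nf-core.

```shell
# Report any *_R1.fastq.gz file in a directory that lacks its *_R2.fastq.gz mate.
# Usage: check_read_pairs [dir]   (dir defaults to the current directory)
# Returns 0 when every R1 file has a mate, 1 otherwise.
check_read_pairs() {
    dir="${1:-.}"
    missing=0
    for r1 in "$dir"/*_R1.fastq.gz; do
        # When the glob matches nothing, the unexpanded pattern is left in $r1.
        if [ ! -e "$r1" ]; then
            echo "no *_R1.fastq.gz files found in $dir" >&2
            return 1
        fi
        # Swap the _R1 suffix for _R2 to get the expected mate file name.
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running check_read_pairs in the directory that holds your fastq files prints one line per unpaired R1 file and returns a non-zero status if any are found.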
Pro-tips:
- Review the execution_report.html to determine the necessary max_memory value for your analysis.
- You may want to use screen or tmux to manage longer runs.
- See other nextflow pipelines: https://nf-co.re/pipelines
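As one way to follow the tmux tip above, a run can be started in a detached session so that it keeps going after you log out. This is a minimal sketch, assuming tmux is installed on the node you are using. start_detached is a hypothetical helper name, and the session name rnaseq in the usage comment is only an example. Note that the session closes when the command exits; the pipeline reports remain in results/pipeline_info.

```shell
# Start a command in a detached tmux session so it survives logout.
# Usage: start_detached SESSION_NAME 'COMMAND STRING'
# start_detached is a hypothetical helper, not part of tmux or Nextflow.
start_detached() {
    session="$1"
    cmd="$2"
    if ! command -v tmux >/dev/null 2>&1; then
        echo "tmux not found; consider using screen instead" >&2
        return 1
    fi
    # -d: create the session without attaching; -s: name the session.
    # tmux runs the command string through a shell, so quote it as one argument.
    tmux new-session -d -s "$session" "$cmd"
}

# Example (not run here):
#   start_detached rnaseq 'nextflow run nf-core/rnaseq -profile test,singularity'
#   tmux attach -t rnaseq     # reattach to watch progress; detach with Ctrl-b d
```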