Pipeline for pharmacogenomics research based on PyPGx package

LlaneroHiboreo/nextflow_pharmacogenomics

nextflow_pharmacogenomics

Pipeline for pharmacogenomics research based on PyPGx package

Pipeline Structure

Overview

This document describes how to use a Nextflow workflow for pharmacogenomics analysis. The workflow is configured to run with Docker or Singularity containers, which keeps results reproducible across computing environments.

Prerequisites

  • Nextflow >=22.10.1
  • Singularity or Docker (the workflow runs with either container runtime)
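
The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of the pipeline: the helper names `version_ge` and `check_prerequisites` are hypothetical, and it assumes a `sort` that supports `-V` and a `grep` that supports `-o` (both are available in GNU coreutils/grep).

```bash
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: `sort` supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Checks the tools listed under Prerequisites; returns non-zero on the first problem.
check_prerequisites() {
    min_nextflow="22.10.1"   # minimum version stated in this README

    if ! command -v nextflow >/dev/null 2>&1; then
        echo "nextflow not found on PATH" >&2
        return 1
    fi

    # `nextflow -version` prints a banner; take the first x.y.z string in it.
    found="$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
    if ! version_ge "$found" "$min_nextflow"; then
        echo "nextflow $found is older than $min_nextflow" >&2
        return 1
    fi

    # The workflow runs with Docker or Singularity, so one of them must be present.
    if ! command -v docker >/dev/null 2>&1 && ! command -v singularity >/dev/null 2>&1; then
        echo "neither docker nor singularity found on PATH" >&2
        return 1
    fi
}
```

Running `check_prerequisites && echo "prerequisites OK"` after sourcing the functions reports the first missing or outdated tool on standard error.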

Installation

Clone this repository:

git clone <repository URL>
cd nextflow_pharmacogenomics

Parameters Description

  • --outdir: Output directory where the results are saved (for example, results).
  • --input: Path to the input CSV file with sample information. It must include columns for the sample name, the BAM file, and the BAM index file.
  • --fasta: Path to the reference genome in FASTA format.
  • --panel_files: Path (or quoted glob pattern) to the panel VCF files used in the analysis.
  • --panel_indexes: Path (or quoted glob pattern) to the index files of the panel VCF files.
  • --star_alleles_list: Comma-separated list of genes for which star alleles are to be called.
  • --no_star_alleles_list: Comma-separated list of genes for which star alleles are not to be called.
  • --cnv_caller_files: Path (or quoted glob pattern) to the CNV caller files, typically in ZIP format.

The VCF panel files and CNV caller files can be obtained from here. The list of alleles to use in the analysis can be found here.
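
Because one gene list names the genes for which star alleles are called and the other names the genes for which they are not, a gene appearing in both lists is probably a mistake. The sketch below shows one way to detect that before launching the run. It is an illustration only: the function name `list_overlap` is hypothetical, and whether the pipeline itself rejects overlapping lists is not stated in this README.

```bash
# Prints every gene that appears in both comma-separated lists, one per line.
# $1: value intended for --star_alleles_list
# $2: value intended for --no_star_alleles_list
list_overlap() {
    for gene in $(printf '%s' "$1" | tr ',' ' '); do
        # Wrap the second list in commas so only whole gene names match
        # (CYP2C9 must not match CYP2C19).
        case ",$2," in
            *",$gene,"*) printf '%s\n' "$gene" ;;
        esac
    done
}
```

For example, `list_overlap "CYP2D6,G6PD" "TPMT,G6PD"` prints `G6PD`, while two disjoint lists print nothing.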

Running the Workflow

The workflow requires the following input parameters:

nextflow run main.nf \
-c custom.config \
-profile singularity \
--outdir results \
--input assets/samplesheet.csv \
--fasta /path/to/reference.fasta \
--panel_files '/path/to/pypgx-bundle/1kgp/GRCh38/*.vcf.gz' \
--panel_indexes '/path/to/pypgx-bundle/1kgp/GRCh38/*.vcf.gz.tbi' \
--star_alleles_list CYP2A6,CYP2B6,CYP2D6,CYP2E1,CYP4F2,G6PD,GSTM1,SLC22A2,SULT1A1,UGT1A4,UGT2B15,UGT2B17 \
--no_star_alleles_list ABCB1,ABCG2,CACNA1S,CFTR,CYP1A1,CYP1A2,CYP1B1,CYP2A13,CYP2C8,CYP2C9,CYP2C19,CYP2F1,CYP2J2,CYP2R1,CYP2S1,CYP2W1,CYP3A4,CYP3A5,CYP3A7,CYP3A43,CYP4A11,CYP4A22,CYP4B1,CYP17A1,CYP19A1,CYP26A1,DPYD,F5,GSTP1,IFNL3,NAT1,NAT2,NUDT15,POR,PTGIS,RYR1,SLC15A2,SLCO1B1,SLCO1B3,SLCO2B1,TBXAS1,TPMT,UGT1A1,UGT2B7,VKORC1,XPC \
--cnv_caller_files '/path/to/PYPGX/pypgx-bundle/cnv/GRCh38/*.zip' \
-w work
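
Since `--panel_files` and `--panel_indexes` are passed as two separate glob patterns, it can help to confirm that every panel VCF has a matching index before starting a run. The following is a minimal sketch under two assumptions: the indexes sit next to the VCF files and use the `.tbi` suffix, as in the example command above. The function name `missing_indexes` is hypothetical.

```bash
# Prints each *.vcf.gz file in directory $1 that has no matching *.vcf.gz.tbi file.
missing_indexes() {
    for vcf in "$1"/*.vcf.gz; do
        [ -e "$vcf" ] || continue          # the glob matched nothing
        [ -e "$vcf.tbi" ] || printf '%s\n' "$vcf"
    done
}
```

Calling `missing_indexes /path/to/pypgx-bundle/1kgp/GRCh38` prints nothing when every VCF file has an index.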

Input Files

The input file (samplesheet.csv) should contain a header row with the following columns:

  • sample: name of the sample or experiment
  • bam: path to the BAM file with the aligned reads
  • bai: path to the index file (.bai) of that BAM file
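
A samplesheet in this format can be checked before the run. The sketch below is illustrative and not part of the pipeline: the function name `check_samplesheet` is hypothetical, and it assumes the header is exactly `sample,bam,bai` in that order and that fields contain no quoted commas.

```bash
# Validates a samplesheet: checks the header, then checks that the BAM and
# index files named on each row exist. Returns non-zero on any problem.
#
# Example of an accepted file (paths are placeholders):
#   sample,bam,bai
#   sample1,/path/to/sample1.bam,/path/to/sample1.bam.bai
check_samplesheet() {
    sheet="$1"
    header="$(head -n 1 "$sheet" | tr -d '\r')"
    if [ "$header" != "sample,bam,bai" ]; then
        echo "unexpected header: $header" >&2
        return 1
    fi

    # Skip the header and read the remaining rows in an explicit subshell,
    # whose exit status becomes the return status of this function.
    tail -n +2 "$sheet" | tr -d '\r' | (
        status=0
        while IFS=, read -r sample bam bai || [ -n "$sample" ]; do
            [ -n "$sample" ] || continue   # ignore blank lines
            for f in "$bam" "$bai"; do
                if [ ! -e "$f" ]; then
                    echo "$sample: missing file '$f'" >&2
                    status=1
                fi
            done
        done
        exit "$status"
    )
}
```

Running `check_samplesheet assets/samplesheet.csv && echo "samplesheet OK"` lists any missing files on standard error.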
