SkewX

Introduction

SkewX is a Nextflow pipeline to measure skewed X inactivation from long-read sequencing of native DNA, using either PacBio or Nanopore technologies. It starts from BAM files that include modified basecalls for 5mCG. It first calls heterozygous variants with DeepVariant and phases them into haplotypes with WhatsHap. It then clusters reads based on their methylation profile over CpG islands, and pools this haplotype and epiallele information to measure the skew in X inactivation.

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers, making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process, which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

Pipeline summary

The required input is modBAM files with 5mCG information. The pipeline then runs the following steps:

If the reads are not already aligned, align them to the reference genome with 'Minimap2'
If multiple samples per individual are present, for instance multiple tissues, merge them into a single bam file
Call variants with 'DeepVariant'
Phase SNPs with 'WhatsHap'
Assign reads to haplotypes (haplotag) with 'WhatsHap'
Cluster reads based on methylation profile with 'NanoMethViz'
Measure skew in X inactivation and generate a report for each individual.
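
As a rough illustration of the last step, the sketch below shows one way haplotype tags and methylation clusters could be pooled into a single skew value. It is not the pipeline's actual computation: the function name `skew_from_reads`, the read representation, and the assumption that a methylated CpG island marks the inactive X are all chosen here for illustration only.

```python
from collections import Counter


def skew_from_reads(reads):
    """Estimate the fraction of cells in which haplotype 1 is the active X.

    `reads` is an iterable of (haplotype, state) pairs, where haplotype is
    1 or 2 and state is "methylated" or "unmethylated" (hypothetical labels
    standing in for the haplotag and the methylation cluster of each read).
    """
    counts = Counter(reads)
    # Assumption: an unmethylated CpG island comes from the active X,
    # a methylated one from the inactive X.
    # Reads supporting "haplotype 1 is active":
    h1_active = counts[(1, "unmethylated")] + counts[(2, "methylated")]
    # Reads supporting "haplotype 2 is active":
    h2_active = counts[(2, "unmethylated")] + counts[(1, "methylated")]
    total = h1_active + h2_active
    if total == 0:
        raise ValueError("no phased reads with a methylation state")
    return h1_active / total
```

Under these assumptions, a value near 0.5 would indicate balanced X inactivation, while values near 0 or 1 would indicate strong skew towards one haplotype.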

Quick Start

Install or module load Nextflow (>=21.10.3)
Install any of Docker, Singularity (you can follow this tutorial), Podman, Shifter or Charliecloud for full pipeline reproducibility. You can use Conda both to install Nextflow itself and to manage software within pipelines, but please only use it within pipelines as a last resort; see docs.
IMPORTANT: make sure your home directory is mounted inside Singularity containers. Add "export NXF_SINGULARITY_HOME_MOUNT=true" to your .bashrc or to your session environment before launching the pipeline; by default Singularity will not be able to find your home directory.
Ensure required files (.bed files, .fa reference) are properly specified as parameters in the config (nextflow.config)

Start running your own analysis!

nextflow main.nf --input samplesheet.csv --outdir skew_results --fasta chm13v2.0.fa --cgi CGIs_CHM13v2_chrX.bed -profile singularity
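
If you prefer to launch the pipeline from a script, the command above can be wrapped in a small Python helper that also checks the input files and sets the Singularity home mount variable. This is only a sketch: the helper names (`build_command`, `missing_inputs`, `launch_env`) are hypothetical, and only the command-line options shown in the command above are used.

```python
import os
from pathlib import Path


def build_command(samplesheet, outdir, fasta, cgi, profile="singularity"):
    """Return the Quick Start command as an argument list."""
    return [
        "nextflow", "main.nf",
        "--input", str(samplesheet),
        "--outdir", str(outdir),
        "--fasta", str(fasta),
        "--cgi", str(cgi),
        "-profile", profile,
    ]


def missing_inputs(*paths):
    """Return the paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


def launch_env(base=None):
    """Copy the environment and enable the Singularity home mount."""
    env = dict(os.environ if base is None else base)
    env["NXF_SINGULARITY_HOME_MOUNT"] = "true"
    return env
```

A typical use would be to call `missing_inputs(...)` on the samplesheet, reference and BED file, stop if the returned list is not empty, and otherwise pass the result of `build_command(...)` to `subprocess.run(cmd, env=launch_env(), check=True)`.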

Documentation

Example data

An example dataset is available in the test_data directory of this repository. The dataset covers a small region of the mouse X chromosome and includes a BAM file with methylation information. The pipeline can be run on this dataset with the following command:

nextflow main.nf --input test_data/samplesheet.csv --outdir skew_test_results --fasta test_data/mm10_chrX.fa --cgi test_data/mm10_chrX_CGI.bed -profile test

Credits

SkewX was originally written by Quentin Gouil, James Lancaster and Ed Yang.

We thank the following people for their extensive assistance in the development of this pipeline:

Kathleen Zeglinski for her superior Nextflow expertise
Shian Su for implementing new features in NanoMethViz

Citations

If you use SkewX for your analysis, please cite it using the following DOI: 10.1101/2024.03.20.585856

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
