jIskCoder/malaria-variant-calling

 
 

Malaria (Plasmodium vivax) example pipeline

This is an example BioNix pipeline for variant calling in a target region over 5k publicly available Plasmodium vivax sequencing samples. The pipeline aligns each sample against the reference genome, sorts the reads, then calls variants jointly on the population using GATK. Copy number is called independently on each sample using QDNAseq.

Notable design choices:

  • there are a lot of samples, so fetchurl is overridden in nixpkgs to prevent substitutions, which avoids querying the cache for each sequence;
  • similarly, BioNix's stage is overridden to prevent substitutions.
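
The fetchurl override described above could be sketched as a nixpkgs overlay along the following lines. This is an illustrative sketch, not necessarily the code used in this repository: it assumes that setting the standard derivation attribute allowSubstitutes to false is what "prevent substitutions" refers to, and the file name overlay.nix is hypothetical.

```nix
# overlay.nix (hypothetical file name)
# Wrap nixpkgs' fetchurl so that every derivation it produces is
# marked as non-substitutable. Nix then fetches each file itself
# instead of first querying binary caches for it.
final: prev: {
  fetchurl = args:
    (prev.fetchurl args).overrideAttrs (_: {
      # Assumption: this is the attribute the README's
      # "prevent substitutions" refers to.
      allowSubstitutes = false;
    });
}
```

Such an overlay would be applied with something like `import nixpkgs { overlays = [ (import ./overlay.nix) ]; }`. Note that wrapping fetchurl in a plain function drops its `override` attribute, so a more careful version may be needed if other code relies on it. The same overrideAttrs pattern would presumably apply to BioNix's stage, but the exact hook for that is not shown here.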

Building/Executing

This repository is configured as a Nix flake and can be built with:

nix build github:jbedo/malaria-variant-calling

As the full workflow runs over a significant number of sequences, it requires a large amount of storage space and computational time. To enable testing and experimentation, the workflow can be run on a small subset of the data with:

nix build github:jbedo/malaria-variant-calling#small

Slurm cluster execution

The repository is also set up to exercise experimental Slurm support patches. When building on a Slurm cluster, using a Nix built with the patch should be sufficient to submit the jobs to the queue.

About

BioNix malaria variant calling workflow
