natforsdick/weta-genome-assembly

Wētā genome assembly

All scripts related to genome assembly for the Little Barrier giant wētā (Deinacrida heteracantha). This project is a Genomics Aotearoa & Manaaki Whenua - Landcare Research collaboration led by Thomas Buckley. Contributors include Manpreet Dhami, Ann McCartney, Dukchul Park, Dini Senanayake, and Natalie Forsdick.

Input data: PacBio CLR 'HiFi-like', Illumina HiSeq, Hi-C.

01-assembly - Testing hifiasm, HiCanu, and MaSuRCA for assembling the PacBio 'HiFi-like' data. As of 2022-07-04, only the hifiasm run has completed.
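A minimal sketch of what the hifiasm step could look like, assuming a single FASTQ file of 'HiFi-like' reads. The file name, output prefix, and thread count are placeholders and are not taken from the scripts in 01-assembly. By default the script only prints the command so it can be checked before it is submitted.

```shell
#!/usr/bin/env bash
# Hypothetical names: none of these are taken from the repository's scripts.
READS="weta-hifi-like.fastq.gz"   # input reads
PREFIX="weta-hifiasm"             # output prefix
THREADS=16

# Build the hifiasm command line: -o sets the output prefix, -t the thread count.
hifiasm_cmd() {
    printf 'hifiasm -o %s -t %s %s\n' "$PREFIX" "$THREADS" "$READS"
}

# hifiasm writes contigs as GFA; convert GFA segment (S) lines on stdin to FASTA.
gfa_to_fasta() {
    awk '/^S/ { print ">" $2; print $3 }'
}

# Print the command by default; set DRY_RUN=0 to actually run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    hifiasm_cmd
else
    eval "$(hifiasm_cmd)"
fi
```

In recent hifiasm releases the primary contigs are written to a file named like weta-hifiasm.bp.p_ctg.gfa, so the conversion would be `gfa_to_fasta < weta-hifiasm.bp.p_ctg.gfa > weta-hifiasm.p_ctg.fa`. Check the output names produced by the installed version.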

02-purge-dups - Running the purge_dups pipeline to remove haplotypic duplication (haplotigs and contig overlaps) from the assembly.
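The purge_dups steps can be sketched as below, following the order given in the purge_dups documentation. The input file names and thread count are hypothetical, and the minimap2 preset (map-pb) is the one used in the purge_dups documentation rather than a setting confirmed by this repository. The script prints each command by default instead of running it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs: not taken from the repository's scripts.
ASM="asm.fa"
READS="reads.fastq.gz"
THREADS=16
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$1"
    else
        eval "$1"
    fi
}

purge_dups_steps() {
    # 1. Map reads to the assembly and compute read-depth statistics.
    run "minimap2 -x map-pb -t $THREADS $ASM $READS | gzip -c - > reads.paf.gz"
    run "pbcstat reads.paf.gz"
    run "calcuts PB.stat > cutoffs 2> calcuts.log"
    # 2. Split the assembly at gaps and align it against itself.
    run "split_fa $ASM > $ASM.split"
    run "minimap2 -x asm5 -DP -t $THREADS $ASM.split $ASM.split | gzip -c - > $ASM.split.self.paf.gz"
    # 3. Identify duplicated regions, then extract purged and haplotig sequences.
    run "purge_dups -2 -T cutoffs -c PB.base.cov $ASM.split.self.paf.gz > dups.bed 2> purge_dups.log"
    run "get_seqs -e dups.bed $ASM"
}

purge_dups_steps
```

It is worth inspecting the coverage histogram and the cutoffs chosen by calcuts before trusting the purged output, since poorly placed cutoffs can remove real sequence.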

03-scaffolding - Using the Dovetail Omni-C pipeline and YaHS to scaffold the assembly.
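A minimal sketch of the YaHS step, assuming the Hi-C reads have already been aligned to the contigs (for example with the Omni-C mapping pipeline) and written to a BAM file. All file names and the output prefix are hypothetical. The script prints the commands by default instead of running them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs: not taken from the repository's scripts.
CONTIGS="purged.fa"          # contigs to scaffold
HIC_BAM="omnic-mapped.bam"   # Hi-C alignments against those contigs
PREFIX="weta-yahs"           # output prefix
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$1"
    else
        eval "$1"
    fi
}

scaffold_steps() {
    # YaHS expects a FASTA index (.fai) alongside the contig file.
    run "samtools faidx $CONTIGS"
    # Scaffold the contigs; -o sets the prefix of the output files.
    run "yahs -o $PREFIX $CONTIGS $HIC_BAM"
}

scaffold_steps
```

With these settings YaHS would write the final scaffolds to weta-yahs_scaffolds_final.fa, with the matching AGP file alongside it.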

The subsequent steps are exploratory at this stage.

04-fill-polish - Exploring the use of gap-filling tools to improve the scaffolded assembly.

05-alignment - Aligning draft genomes against one another for comparisons.

06-read-correction - Attempting to correct raw PacBio CLR reads using FMLRC2, with the intention of using a draft assembly as a 'scaffold' against which to assemble these reads.

QC - Scripts for raw read QC and assembly QC, including FastQC, Hi-C and Pore-C QC, BUSCO5, and Merqury.
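The read and assembly QC tools could be invoked along the following lines. This is a sketch under stated assumptions: the file names, output directories, thread count, BUSCO lineage (insecta_odb10), and k-mer size (k=21) are all placeholders and are not taken from the scripts in QC. The script prints the commands by default instead of running them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs and settings: not taken from the repository's scripts.
READS="illumina-reads.fastq.gz"   # accurate short reads for the k-mer database
ASM="asm.fa"
LINEAGE="insecta_odb10"           # assumed BUSCO lineage
KMER=21                           # assumed k-mer size
THREADS=16
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$1"
    else
        eval "$1"
    fi
}

qc_steps() {
    # Raw read QC; FastQC needs the output directory to exist.
    run "mkdir -p fastqc-out"
    run "fastqc -t $THREADS -o fastqc-out $READS"
    # Gene-space completeness of the assembly.
    run "busco -i $ASM -l $LINEAGE -m genome -o busco-out -c $THREADS"
    # k-mer based completeness and consensus quality.
    run "meryl k=$KMER count output reads.meryl $READS"
    run "merqury.sh reads.meryl $ASM merqury-out"
}

qc_steps
```

Merqury ships a helper script for choosing a k-mer size from the genome size, which would be a better basis than the fixed value assumed here.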
