Bioinformatic scripts/code for Llewellyn et al. (2023) First Whole-Genome Sequence and Flow Cytometry Genome Size Data for the Lichen-Forming Fungus Ramalina farinacea (Ascomycota). Genome Biology and Evolution 15(5), evad074 doi:10.1093/gbe/evad074
All bash scripts were run on the Imperial College London High Performance Computer (HPC), except those in the Functional Annotation section (details below). This HPC uses the PBS queueing system, so cores, RAM and runtimes in the .sh scripts are specified in PBS format. All scripts are written for a single genome file.
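For readers unfamiliar with PBS, the sketch below shows roughly what a resource header in PBS format looks like. The resource values, job name and THREADS variable are illustrative assumptions, not the settings used in the scripts in this repository.

```shell
#!/bin/bash
# Hypothetical PBS job header: all resource values below are placeholders.
#PBS -l select=1:ncpus=8:mem=32gb
#PBS -l walltime=24:00:00
#PBS -N example_job

# Thread count passed to tools; keep it in sync with ncpus above (assumed value).
THREADS=8

# PBS sets PBS_O_WORKDIR to the directory qsub was run from;
# fall back to the current directory when run outside PBS.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# Placeholder for the real tool invocation in each script.
echo "Would run with ${THREADS} threads in $(pwd)"
```

When adapting a script to another PBS cluster, the `#PBS` lines are usually the only part that needs changing.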
Order of analyses:
2. Mycobiont filtering rounds 1 and 2
cd basecalling-assembly
qsub guppy.sh
qsub flye.sh
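The scripts in each directory are listed in the order they are run. Assuming each script needs the output of the one before it, one way to avoid waiting for jobs by hand is to chain them with PBS job dependencies. The sketch below is not part of the repository; the `submit_chain` function and the `QSUB` override variable are hypothetical names.

```shell
#!/bin/bash
# Submit scripts in order, making each job wait until the previous one
# has finished successfully (PBS "afterok" dependency).
# QSUB is a hypothetical override so the function can be dry-run or tested.
submit_chain() {
    local prev="" script jobid
    for script in "$@"; do
        if [ -z "$prev" ]; then
            # First job has no dependency.
            jobid=$("${QSUB:-qsub}" "$script") || return 1
        else
            # qsub prints the job ID, which the next job depends on.
            jobid=$("${QSUB:-qsub}" -W depend=afterok:"$prev" "$script") || return 1
        fi
        echo "$script -> $jobid"
        prev="$jobid"
    done
}

# Example (run inside basecalling-assembly):
#   submit_chain guppy.sh flye.sh
```

If a job in the chain fails, the jobs that depend on it will not start, so intermediate outputs should still be checked before moving to the next directory.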
cd mycobiont-filtering
The following commands extract Ascomycota reads belonging to the mycobiont using the BlobTools workflow.
qsub blastn.sh
qsub diamond.sh
qsub BlobTools.sh
qsub pull_Asco_reads.sh
qsub flye_Asco.sh
qsub blastn_Asco.sh
qsub diamond_Asco.sh
qsub BlobTools_Asco.sh
qsub pull_Asco_contigs.sh
cd error-correction
The following steps remove redundant contigs, error-correct the remaining mycobiont reads, and produce a final blobplot for the mycobiont assembly.
qsub redundans.sh
qsub Racon.sh
qsub medaka.sh
cd mycobiont-filtering-round3
qsub blastn_Asco_round3.sh
qsub diamond_Asco_round3.sh
qsub BlobTools_Asco_round3.sh
qsub pull_Asco_contigs_round3.sh
qsub mycobiont_figure_plot.sh
cd kmer-profiling
qsub jellyfish.sh
- The k-mer histogram was uploaded to the GenomeScope web server: http://qb.cshl.edu/genomescope/
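As a rough illustration of how a histogram suitable for GenomeScope is typically produced with Jellyfish, a minimal sketch follows. The k-mer size, hash size, thread count, file names and the `kmer_histogram` function are all assumptions; see jellyfish.sh for the settings actually used.

```shell
#!/bin/bash
# Assumed parameters: not taken from jellyfish.sh.
K=21                 # k-mer length
HASH_SIZE=1000000000 # initial hash size passed to -s
THREADS=8

# Count canonical k-mers (-C) in a read file, then export the histogram.
# Usage: kmer_histogram <reads.fastq> <output_prefix>   (hypothetical helper)
kmer_histogram() {
    local reads="$1" prefix="$2"
    jellyfish count -C -m "$K" -s "$HASH_SIZE" -t "$THREADS" \
        -o "$prefix.jf" "$reads" &&
    jellyfish histo -t "$THREADS" "$prefix.jf" > "$prefix.histo"
}

# Example:
#   kmer_histogram reads.fastq sample
# then upload sample.histo to GenomeScope.
```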
cd annotation
The following steps identify repeat regions de novo, soft-mask them, and then annotate proteins.
qsub repeatmodeler.sh
qsub repeatmasker.sh
qsub funannotate.sh
Insert structural annotation instructions here
qsub tapestry.sh
The following functional annotation scripts were run on Queen Mary University of London's Apocrita HPC facility, which uses the Univa Grid Engine batch-queue system.
qsub antismash.sh
qsub interproscan.sh
qsub eggnogmapper.sh
qsub funannotate_annotate.sh
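Because the four scripts above target Univa Grid Engine rather than PBS, their resource headers use `#$` directives. The sketch below shows the general shape of such a header; the parallel environment name, resource values and THREADS variable are assumptions and may differ from the scripts in this repository.

```shell
#!/bin/bash
# Hypothetical Grid Engine job header: all values below are placeholders.
#$ -cwd              # run from the directory the job was submitted from
#$ -pe smp 4         # parallel environment name is site-specific (assumed)
#$ -l h_rt=24:0:0    # runtime limit
#$ -l h_vmem=4G      # memory request

# Grid Engine sets NSLOTS to the number of granted slots;
# default to 1 when run outside the scheduler.
THREADS="${NSLOTS:-1}"

# Placeholder for the real tool invocation in each script.
echo "Would run with ${THREADS} threads"
```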