This repository provides the scripts and data used for the analyses in Sgarlata et al., 2023, "The Genomic Diversity of the Eliurus genus in northern Madagascar with a Putative New Species". If you use these scripts or this workflow, please cite Sgarlata et al., 2023.
-
De novo RAD-seq data assembly in stacks: de novo assembly of raw RAD-sequencing reads using the stacks software. The scripts include:
- parameter exploration for catalog building, following Paris et al., 2017 (parameter_tuning);
- catalog building (catalog_final);
- data assembly and genotype calling (final);
-
Principal Component Analysis: principal component analysis on RAD-seq genomic data for distinguishing the different Eliurus species included in the dataset.
-
ADMIXTURE analysis: This folder includes scripts for carrying out ADMIXTURE analysis and data plotting.
-
Species delimitation analyses: This folder contains scripts for validating species delimitation:
- species delimitation analysis with guided tree (A01 method) performed in BPP (A01 analysis);
- calculation of the genealogical divergence index (gdi) based on parameters estimated in BPP (A00 analysis);
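As an illustration of the gdi calculation, the index is commonly defined as gdi = 1 - exp(-2τ/θ), where τ is the divergence time of a population from its sister and θ is that population's size parameter, both taken from the BPP A00 posterior. The sketch below is a minimal Python version of that formula; the function name and example values are hypothetical and it is not the repository's own script.

```python
import math


def gdi(tau, theta):
    """Genealogical divergence index: 1 - exp(-2 * tau / theta).

    tau   : divergence time of the focal population from its sister
    theta : population size parameter of the focal population
    Both are assumed to come from a BPP A00 run (hypothetical inputs here).
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    if tau < 0:
        raise ValueError("tau must be non-negative")
    return 1.0 - math.exp(-2.0 * tau / theta)


# Hypothetical values, for illustration only.
example = gdi(tau=0.001, theta=0.002)
print(round(example, 4))
```

In practice the index would be computed for every posterior sample so that a distribution of gdi values is obtained rather than a single point estimate. A widely used heuristic treats values below about 0.2 as a single species and values above about 0.7 as distinct species, with intermediate values being ambiguous.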
-
%GC content analysis: This script estimates the percentage GC content of each individual in the dataset, using fasta files (obtained from stacks) imported into R.
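The idea behind the GC calculation can be sketched in a few lines. The repository's script is written in R; the Python version below is only an illustration and makes two assumptions that are not taken from the source: the first token of each fasta header identifies the individual, and records sharing a name are pooled. The function names are hypothetical.

```python
def read_fasta(lines):
    """Parse fasta text lines into {name: sequence}.

    Assumption: the first whitespace-delimited token after '>' is the
    individual's name; records with the same name are concatenated.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records.setdefault(name, [])
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def gc_percent(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else float("nan")


def gc_by_individual(lines):
    """Return {individual: %GC} for fasta text lines."""
    return {name: gc_percent(seq) for name, seq in read_fasta(lines).items()}
```

Ambiguous bases (N) and gaps are excluded from the denominator here; whether the original analysis handles them the same way is not stated in this README.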
-
Isolation-by-distance analysis: This script performs the ad hoc species delimitation test that compares isolation-by-distance within versus between sister taxa.
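The geographic side of such a test can be prepared as in the following sketch, which computes great-circle distances between sampling sites and labels each pair of individuals as "within" or "between" taxa. It is a minimal illustration under stated assumptions, not the repository's script: the input layout (name mapped to taxon, latitude, longitude), the function names, and the use of the haversine formula are all hypothetical choices.

```python
import math
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def pairwise_geo(samples):
    """Label every pair of individuals and compute their geographic distance.

    samples: {name: (taxon, latitude, longitude)}  (hypothetical layout)
    Returns a list of (name_a, name_b, kind, distance_km), where kind is
    "within" for pairs from the same taxon and "between" otherwise.
    """
    rows = []
    for (a, (tax_a, lat_a, lon_a)), (b, (tax_b, lat_b, lon_b)) in combinations(
        sorted(samples.items()), 2
    ):
        kind = "within" if tax_a == tax_b else "between"
        rows.append((a, b, kind, haversine_km(lat_a, lon_a, lat_b, lon_b)))
    return rows
```

The resulting table can then be joined with pairwise genetic distances so that the relationship between genetic and geographic distance is compared for within-taxon and between-taxon pairs.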
-
Mitochondrial DNA analyses: This folder includes scripts for reconstructing the Eliurus phylogenetic tree using mitochondrial cytb sequences.
- Bayesian phylogenetic inference carried out in MRBAYES (MRBAYES_mtDNA);
- Maximum-likelihood phylogenetic inference carried out in RAXML (RAXML_mtDNA);
- Calculation of mitochondrial cytb genetic distances (mtDNA_genetic_distances);
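To make the notion of a genetic distance between aligned cytb sequences concrete, the sketch below computes the uncorrected p-distance (the proportion of differing sites). This is an assumption chosen for simplicity: the README does not say which distance or substitution model the repository's scripts use, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion). Returns NaN if no site can be compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                diffs += 1
    return diffs / compared if compared else float("nan")


def pairwise_p_distances(alignment):
    """Return {(name_a, name_b): p-distance} for an {name: sequence} alignment."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }
```

A model-corrected distance would generally be larger than the p-distance for divergent sequences, so values from this sketch should not be compared directly with published estimates.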
-
Comparison of mtDNA and RAD-seq genetic data: This folder includes scripts for:
- calculating nuclear (RAD-seq) genomic distances between individuals (nuclear RAD-seq genetic distances);
- comparing mitochondrial cytb and nuclear RAD-seq genetic distances (nuclear versus mtDNA genetic distances);
-
RAD-seq phylogenomic analysis - RAXML: This folder includes scripts for:
- inferring Eliurus phylogenetic relationships from concatenated nuclear RAD-seq genomic data (RAXML concatenated);
- inferring Eliurus phylogenetic relationships from partitioned nuclear RAD-seq genomic data (RAXML partitioned);
- plotting the inferred phylogenetic trees (plotting RAXML trees);
-
Genotype Likelihood analyses in ANGSD: This folder includes scripts for obtaining genotype likelihoods from raw RAD-seq data.
-
Morphological analyses: This folder includes several scripts for Eliurus morphological data analysis.
- Discriminant Analysis of Principal Components (DAPC), used for maximising differences in morphology between Eliurus species and identifying the morphological variables that best discriminate the five Eliurus species included in the dataset (DAPC analysis);
- Phylogenetic Generalized Least Squares (PGLS) analysis to measure correlations between morphological variables and 20 bioclimatic variables, while accounting for phylogenetic relatedness (PGLS analysis);