R script for differential gene expression (DGE) analysis with DESeq2 (Love et al., 2014) for ecotoxicological testing on zebrafish embryos. The script requires two input files: a non-normalized count matrix (CountMatrix.csv) and a sample annotation table (coldata.csv).
- A raw gene count matrix can be downloaded from the public ArrayExpress repository (e.g. E-MTAB-9056)
- The coldata file should contain at least the following columns:
  - Condition (treatment condition, e.g. HighExposure, LowExposure, Control)
  - Substance (name of the tested substance)
  - Tank (number of the tank from which the spawning group samples were collected)
- The row names of the coldata file must match the column names of CountMatrix.csv
- Execute this script in the folder where both input files are stored
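
The input requirements above can be checked before running the analysis. The following is a minimal sketch in base R, not taken from the script itself: the helper `check_inputs` is hypothetical, the toy data are made up, and the commented `read.csv` calls assume that the first column of each CSV file holds the row names.

```r
# Hypothetical helper, not part of the script: verifies that coldata
# describes exactly the samples present in the count matrix.
check_inputs <- function(counts, coldata,
                         required = c("Condition", "Substance", "Tank")) {
  missing_cols <- setdiff(required, colnames(coldata))
  if (length(missing_cols) > 0) {
    stop("coldata is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (!setequal(rownames(coldata), colnames(counts))) {
    stop("Row names of coldata do not match column names of the count matrix")
  }
  # Reorder coldata so its rows follow the column order of the count matrix
  coldata[colnames(counts), , drop = FALSE]
}

# Toy data standing in for the two CSV files (values are made up)
counts <- matrix(c(10L, 0L, 25L, 3L, 7L, 12L), nrow = 2,
                 dimnames = list(c("geneA", "geneB"), c("S1", "S2", "S3")))
coldata <- data.frame(
  Condition = c("HighExposure", "Control", "LowExposure"),
  Substance = "SubstanceX",
  Tank = c(2, 1, 3),
  row.names = c("S3", "S1", "S2")
)
coldata <- check_inputs(counts, coldata)

# With the real files, assuming row names are stored in the first column:
# counts  <- as.matrix(read.csv("CountMatrix.csv", row.names = 1, check.names = FALSE))
# coldata <- read.csv("coldata.csv", row.names = 1)
```

Reordering coldata to follow the count matrix columns matters because DESeq2 expects the samples in both objects to be in the same order.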
This script runs DESeq2 with pairwise Wald tests and uses independent hypothesis weighting (IHW; Ignatiadis et al., 2016) to correct p-values for multiple testing, following the Benjamini-Hochberg procedure. Effect size shrinkage is applied through apeglm (Zhu et al., 2019) and, if applied, the effect size cutoff (LFcut) is determined as the 90% quantile of the absolute log2 fold changes. The output is annotated using R's AnnotationDbi with the org.Dr.eg.db package. The script is designed to analyze one tested substance at a time.
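
The LFcut rule can be illustrated with a short base R sketch. The log2 fold change values below are made up, and both the NA handling and the use of R's default quantile type are assumptions; the script may compute the cutoff differently.

```r
# Toy shrunken log2 fold changes (made-up values; NA mimics a gene
# without an estimate)
lfc <- c(-3.2, -0.4, 0.1, 0.0, 0.6, 1.5, 2.8, -0.2, 0.3, NA)

# Effect size cutoff as the 90% quantile of absolute log2 fold changes
LFcut <- unname(quantile(abs(lfc), probs = 0.9, na.rm = TRUE))

# Genes whose absolute effect size exceeds the cutoff
passes_cutoff <- !is.na(lfc) & abs(lfc) > LFcut
```

Because the cutoff is derived from the data, it changes from one tested substance to the next rather than being a fixed threshold.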
Not the most beautiful code in the world, but it does the job ;)