This is the supplemental software for the BMC Bioinformatics article “Network Embedded analysis of Temporal Gene Expression using Mixed Models”.
Microarrays have become a routine tool in many biological labs, resulting in large amounts of datasets containing transcription information in different mutants grown under many conditions. The data from a microarray experiment is a snapshot of the transcription at the time of measurement. Therefore, the conventional methods (log-fold changes, clustering algorithms, etc) have successfully identified transcriptional regulation. Applying these methods to time-series gene expression data inherently assumes that the data points are independent and therefore, losing the temporal evolution of gene expression. Furthermore, many of these methods do not consider the underlying interaction network of genes or proteins they encode for, that plays a key role in the expression of the genes.
This complete code along with the data is available at http://www.github.com/vjethava/NETGEMM. It can be downloaded in linux by using the command: \$ git clone git@github.com:vjethava/NETGEMM.git
The top-level directory structure is as follows:
- code: contains the matlab functions specific to the paper.
- lib: publicly available code which is used.
- lib: additional libraries.
- test: the files which run the main experiments.
- data: the original data in txt and mat formats
- scripts: the conversion scripts between txt-mat
The raw data can be found in the following files:
- aean_expr.txt: Gene expression data corresponding to experiment 1 (Aerobic and Anaerobic setting)
- exp2: This directory has the complete processed data (in matlab format) for Experiment 2, i.e., graph_pp.mat (Graph), graph.mat (Graph if taking both P-P and P-D) interactions, gene_names.txt, expr.mat (gene expression data)
The original raw data provided by biologists is given as xls and xml format.
This directory contains the following files:
- mm_svar.m : generates the comparison for multiple strains over synthetic dataset
- mm_tvar.m : generates the comparison for different number of observation time periods available for synthetic dataset
- run1.m : runs the experiment 1 for gene expression dataset 1
- run2.m runs the experiment 2 for gene expression dataset 2
This directory contains the plots for the synthetic dataset as well as the generated movies for each of the experiments based on real datasets.