This repository contains the code, benchmarking study, and illustration analysis for the R-package `ragt2ridges`.
The R-package `ragt2ridges` performs ridge maximum likelihood estimation of vector auto-regressive processes: the VAR(1) model (more to be added). Prior knowledge may be incorporated in the estimation through a) specification of the edges believed to be absent in the time series chain graph, and b) a shrinkage target towards which the parameter estimate is shrunken for large penalty parameter values. Estimation functionality is accompanied by methodology for penalty parameter selection.
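As a sketch of what ridge maximum likelihood with a shrinkage target amounts to, the penalized log-likelihood can be written as below. The notation is an assumption made for illustration and is not fixed by this README: $\mathbf{A}$ is the matrix of autoregression coefficients, $\boldsymbol{\Omega}_{\varepsilon}$ the inverse of the error covariance matrix, $\mathbf{A}_0$ and $\boldsymbol{\Omega}_0$ the shrinkage targets, and $\lambda_a, \lambda_\omega > 0$ the penalty parameters.

$$
\mathcal{L}^{\mathrm{pen}}(\mathbf{A}, \boldsymbol{\Omega}_{\varepsilon}) = \mathcal{L}(\mathbf{A}, \boldsymbol{\Omega}_{\varepsilon}) - \tfrac{1}{2} \lambda_a \| \mathbf{A} - \mathbf{A}_0 \|_F^2 - \tfrac{1}{2} \lambda_\omega \| \boldsymbol{\Omega}_{\varepsilon} - \boldsymbol{\Omega}_0 \|_F^2
$$

For large penalty parameters the penalty terms dominate, so the maximizer is pulled towards the targets. Edges believed to be absent correspond to elements of $\mathbf{A}$ (or $\boldsymbol{\Omega}_{\varepsilon}$) that are fixed at zero during estimation.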
In addition, the package offers supporting functionality for exploiting estimated models. Among others, it provides i) a procedure to infer the support of the non-sparse ridge estimate (and thereby of the time series chain graph), ii) a table of node-wise network summary statistics, iii) mutual information analysis, and iv) impulse response analysis.
First, the VAR(1) model, its properties, and its associated time-series chain graph are recapitulated. With this knowledge refreshed, the ridge penalized full ML estimator of the VAR(1) model is presented. The estimator is extended to allow the incorporation of prior knowledge to support both temporal and contemporaneous interactions. In both cases, memory efficient evaluation of the estimator is outlined. Cross-validation (which requires minor changes to the estimator) is described to guide the choice of the penalty parameters. Then, several strategies (e.g. selection of temporal and contemporaneous relationships, mutual information, and path analysis) for downstream exploitation of the estimated model are discussed.
The R-package `ragt2ridges` performs ridge maximum likelihood estimation of vector auto-regressive processes: the VAR(1), VAR(2), fused VAR(1), and VARX(1) models. The estimator is extended to allow the incorporation of prior knowledge on the support of both temporal and contemporaneous interactions. In both cases, memory efficient evaluation of the estimator is outlined. Cross-validation (which requires minor changes to the estimator) is described to guide the choice of the penalty parameters.
In addition, the package offers supporting functionality for exploiting estimated models. Among others, it provides i) a procedure to infer the support of the non-sparse ridge estimate, ii) a table of node-wise network summary statistics, iii) path analysis, iv) mutual information analysis, and v) impulse response analysis.
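To make the impulse response analysis concrete, here is a short worked derivation for the VAR(1) case. It assumes the standard form $\mathbf{Y}_{t} = \mathbf{A} \mathbf{Y}_{t-1} + \boldsymbol{\varepsilon}_{t}$; the symbols are chosen for illustration and are not taken from this README. Recursive substitution over $\tau$ time steps gives:

$$
\mathbf{Y}_{t+\tau} = \mathbf{A}^{\tau} \mathbf{Y}_{t} + \sum_{s=0}^{\tau-1} \mathbf{A}^{s} \boldsymbol{\varepsilon}_{t+\tau-s},
\qquad \text{so that} \qquad
\frac{\partial \mathbf{Y}_{t+\tau}}{\partial \boldsymbol{\varepsilon}_{t}^{\top}} = \mathbf{A}^{\tau}.
$$

Under this assumption, element $(j, k)$ of $\mathbf{A}^{\tau}$ describes how a unit innovation in variate $k$ at time $t$ propagates to variate $j$ after $\tau$ time steps.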
The ridge ML estimator of the VAR(1) model is compared to its SCAD counterpart (Abegaz and Wit, 2013), which has been implemented in the R-package `SparseTSCGM`, by means of simulation. The two methods are compared in terms of the squared Frobenius loss of the estimates and in terms of the sensitivity and specificity of their edge selection for the time-series chain graph.
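For reference, the comparison criteria can be written out as follows. The notation is assumed for illustration: $\hat{\mathbf{A}}$ is an estimate of a true parameter matrix $\mathbf{A}$ (the same loss applies to any estimated parameter matrix), and $TP$, $FP$, $TN$, $FN$ count the true positive, false positive, true negative, and false negative edges of the selected graph.

$$
\| \hat{\mathbf{A}} - \mathbf{A} \|_F^2 = \sum_{j,k} \left( \hat{a}_{jk} - a_{jk} \right)^2,
\qquad
\text{sensitivity} = \frac{TP}{TP + FN},
\qquad
\text{specificity} = \frac{TN}{TN + FP}.
$$

A lower Frobenius loss indicates a more accurate parameter estimate, while sensitivity and specificity measure how well present and absent edges, respectively, are recovered.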
In all, the proposed ridge estimator of the VAR(1) model is a worthy competitor to the SCAD estimator of Abegaz and Wit (2013). Concerning the Frobenius loss, the ridge estimator even seems preferable, while in more high-dimensional settings the edge selection properties of the ridge estimator are not inferior to those of its SCAD counterpart.
Illustration of the time-series chain graphs underlying the various vector autoregressive models estimated using ridge penalized maximum likelihood.
This technique aims to unravel the dynamic interrelatedness of the variates (e.g., mRNA genes) of a single molecular level (e.g., mRNA gene expression). The model thus explains the temporal dependencies among the genes and captures the contemporaneous ones (through the inverse of the error covariance matrix).
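In commonly used notation (the symbols are an assumption made for illustration and are not fixed by this README), the VAR(1) model for the $p$-dimensional vector $\mathbf{Y}_{t}$ of variates at time point $t$ reads:

$$
\mathbf{Y}_{t} = \mathbf{A} \mathbf{Y}_{t-1} + \boldsymbol{\varepsilon}_{t},
\qquad
\boldsymbol{\varepsilon}_{t} \sim \mathcal{N}(\mathbf{0}_{p}, \boldsymbol{\Sigma}_{\varepsilon}).
$$

Nonzero elements of $\mathbf{A}$ correspond to temporal dependencies, and nonzero off-diagonal elements of $\boldsymbol{\Omega}_{\varepsilon} = \boldsymbol{\Sigma}_{\varepsilon}^{-1}$ to contemporaneous ones.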
The previous technique is extended to assess dynamic dependencies over a longer time range than that implied by the VAR(1) model. This is done through the VAR(2) model, which includes an additional explanatory time point; that is, the two time points directly preceding the current one may both contribute to the observed variation in the latter.
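Written out, with notation assumed for illustration ($\mathbf{Y}_{t}$ the vector of variates at time point $t$, $\mathbf{A}_1$ and $\mathbf{A}_2$ the coefficient matrices of the first and second lag, and $\boldsymbol{\varepsilon}_{t}$ the error), the VAR(2) model reads:

$$
\mathbf{Y}_{t} = \mathbf{A}_{1} \mathbf{Y}_{t-1} + \mathbf{A}_{2} \mathbf{Y}_{t-2} + \boldsymbol{\varepsilon}_{t},
\qquad
\boldsymbol{\varepsilon}_{t} \sim \mathcal{N}(\mathbf{0}_{p}, \boldsymbol{\Sigma}_{\varepsilon}).
$$

Setting $\mathbf{A}_{2} = \mathbf{0}$ recovers a model with a single explanatory time point.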
Using the fused VAR(1) model, differences among the interaction networks of several groups of samples may be identified. To this end, a VAR(1) model is assumed for each group, but the group-wise models are fitted jointly to facilitate the borrowing of information when the groups share network features.
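A minimal sketch of the idea, under assumed notation: each group $g = 1, \ldots, G$ has its own coefficient matrix $\mathbf{A}_{g}$, and a fusion penalty discourages differences between groups. The particular penalty shown is one typical choice and is an assumption; the exact form used by the package is not stated in this README.

$$
\mathbf{Y}_{t}^{(g)} = \mathbf{A}_{g} \mathbf{Y}_{t-1}^{(g)} + \boldsymbol{\varepsilon}_{t}^{(g)},
\qquad
\text{fusion penalty:} \quad \tfrac{1}{2} \lambda_{f} \sum_{g < g'} \| \mathbf{A}_{g} - \mathbf{A}_{g'} \|_F^2 .
$$

With a large fusion parameter $\lambda_{f}$ the group-wise estimates are pulled towards each other; with $\lambda_{f} = 0$ the groups are fitted separately.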
When information on additional molecular levels (e.g., DNA copy number or microRNA gene expression) is available, those levels may be incorporated into the network. The VARX(1) model integrates time-varying covariates from other molecular levels (corresponding to the “X” in VARX) into the VAR(1) model.
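In assumed notation, with $\mathbf{X}_{t}$ the vector of time-varying covariates and $\mathbf{B}$ the matrix of their regression coefficients, the VARX(1) model can be sketched as below. The time index at which the covariates enter is an assumption made for illustration.

$$
\mathbf{Y}_{t} = \mathbf{A} \mathbf{Y}_{t-1} + \mathbf{B} \mathbf{X}_{t} + \boldsymbol{\varepsilon}_{t},
\qquad
\boldsymbol{\varepsilon}_{t} \sim \mathcal{N}(\mathbf{0}_{p}, \boldsymbol{\Sigma}_{\varepsilon}).
$$

Nonzero elements of $\mathbf{B}$ then represent effects of the other molecular level on the variates in $\mathbf{Y}_{t}$.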
The R-package `ragt2ridges` depends on `rags2ridges` and on R >= 3.0.0, and is also available from CRAN. Installing it from GitHub requires the package `devtools`:
```r
devtools::install_github("viktormiok/ragt2ridges", build_vignettes=TRUE)
```
Please restart R before loading the package and its documentation:
```r
library(ragt2ridges)
utils::help(ragt2ridges)
utils::vignette("ragt2ridges")
```
All the data required for performing temporal integrative genomics analysis and published in the reference articles have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus (GEO) and are accessible through the GEO Series accession numbers:
Data type | GEO number |
---|---|
CGH Arrays | GSE138724 |
mRNA Arrays | GSE138079 |
miRNA Arrays | GSE78279 |
To access one of the data sets, for instance GSE78279, run the code below. Unpacking the data requires tar and gunzip, which should already be available on most systems.
```shell
cd ../  # go to the main GitHub repo folder
mkdir -p data/tigaR_data_analysis/
cd data/tigaR_data_analysis/
wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE78nnn/GSE78279/suppl/GSE78279_RAW.tar
mkdir GSE78279_RAW
tar -C GSE78279_RAW -xvf GSE78279_RAW.tar
gunzip GSE78279_RAW/*_Regional_*
```
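The same download pattern should carry over to the other accession numbers in the table. The sketch below only builds and prints the URLs; it does not download anything. The helper `geo_raw_url` is hypothetical (it is not part of `ragt2ridges`), and it assumes that each series provides a `_RAW.tar` supplementary file under the same GEO directory layout as GSE78279.

```shell
# Hypothetical helper: build the GEO supplementary-file URL for a series.
geo_raw_url() {
    acc="$1"
    # GEO groups series into directories named after the accession with
    # its last three digits replaced by "nnn" (GSE78279 -> GSE78nnn).
    stub="${acc%???}nnn"
    printf 'ftp://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/suppl/%s_RAW.tar\n' \
        "$stub" "$acc" "$acc"
}

# Print the assumed URLs for the three data sets listed in the table.
for acc in GSE138724 GSE138079 GSE78279; do
    geo_raw_url "$acc"
done
```

A printed URL can then be passed to `wget` in place of the hard-coded one, for example `wget "$(geo_raw_url GSE138079)"`, after checking on the GEO page that the file exists.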
Please see the following tutorials for detailed examples of how to use `ragt2ridges`:
`ragt2ridges` is distributed under the GPL-3.0 License. Please read the license, which is provided in the `LICENSE` file, before using `ragt2ridges`.
Publications related to `ragt2ridges` include:
- Miok, V., Wilting, S.M., van Wieringen, W.N. (2017), "Ridge estimation of the VAR(1) model and its time series chain graph from multivariate time-course omics data", Biometrical Journal, 59(1), 172-191.
- Babion, I., Miok, V., Jaspers, A., Huseinovic, A., Steenbergen, R.D., van Wieringen, W.N., Wilting, S.M. (2018), "Comprehensive molecular profiling of HPV-induced transformation over time", Cancer Research, 78(13 Supplement), 5059-5059.
- Miok, V., Wilting, S.M., van Wieringen, W.N. (2019), "Ridge estimation of network models from time-course omics data", Biometrical Journal, 61(2), 391-405.
- Babion, I., Miok, V., Jaspers, A., Huseinovic, A., Steenbergen, R.D., van Wieringen, W.N., Wilting, S.M. (2020), "Identification of Deregulated Pathways, Key Regulators, and Novel miRNA-mRNA Interactions in HPV-Mediated Transformation", Cancers, 12(3), 700.
- van Wieringen, W.N. (2018), "ragt2ridges: Ridge Estimation of Vector Auto-Regressive (VAR) Processes", R package, version 0.3.2.
Please cite the relevant publications if you use `ragt2ridges`.