-
Notifications
You must be signed in to change notification settings - Fork 12
/
README.Rmd
179 lines (146 loc) · 6.45 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
---
output:
html_document:
variant: markdown_github
keep_md: true
md_document:
variant: markdown_github
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# covFactorModel
Estimation of covariance matrix via factor models with application
to financial data. Factor models decompose the asset returns into an
exposure term to some factors and a residual idiosyncratic component. The
resulting covariance matrix contains a low-rank term corresponding to the
factors and another full-rank term corresponding to the residual component.
This package provides a function to separate the data into the factor
component and residual component, as well as to estimate the corresponding
covariance matrix. Different kind of factor models are considered, namely,
macroeconomic factor models and statistical factor models. The estimation
of the covariance matrix accepts different kinds of structure on the
residual term: diagonal structure (implying that residual component is
uncorrelated) and block diagonal structure (allowing correlation within
sectors). The package includes a built-in database containing stock symbols
and their sectors.
The package is based on the book:
R. S. Tsay, _Analysis of Financial Time Series._ John Wiley & Sons, 2005.
## Installation
```{r, eval = FALSE}
# Installation from CRAN (not available yet)
#install.packages("covFactorModel")
# Installation from GitHub
# install.packages("devtools")
devtools::install_github("dppalomar/covFactorModel")
# Getting help
library(covFactorModel)
help(package = "covFactorModel")
package?covFactorModel
?factorModel
?covFactorModel
?getSectorInfo
```
## Vignette
For more detailed information, please check the vignette: [GitHub-html-vignette](https://rawgit.com/dppalomar/covFactorModel/master/vignettes/covFactorModel-vignette.html) and [GitHub-pdf-vignette](https://rawgit.com/dppalomar/covFactorModel/master/vignettes/covFactorModel-vignette.pdf).
## Usage of `factorModel()`
The function `factorModel()` builds a factor model for the data, i.e., it decomposes the asset returns into a factor component and a residual component. The user can choose different types of factor models, namely, macroeconomic, BARRA, or statistical. We start by generating some synthetic data:
```{r, message = FALSE, warning = FALSE}
library(covFactorModel)
library(xts)
library(MASS)
# generate synthetic data
set.seed(234)
N <- 3 # number of stocks
T <- 5 # number of samples
mu <- rep(0, N)
Sigma <- diag(N)/1000
# generate asset returns TxN data matrix
X <- xts(mvrnorm(T, mu, Sigma), order.by = as.Date('2017-04-15') + 1:T)
colnames(X) <- c("A", "B", "C")
# generate K=2 macroeconomic factors
econ_fact <- xts(mvrnorm(T, c(0, 0), diag(2)/1000), order.by = index(X))
colnames(econ_fact) <- c("factor1", "factor2")
```
We first build a _macroeconomic factor model_, which fits the data to the given macroeconomic factors:
```{r}
macro_econ_model <- factorModel(X, type = "Macro", econ_fact = econ_fact)
# sanity check
X_ <- with(macro_econ_model,
matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual)
norm(X - X_, "F")
```
Next, we build a _BARRA industry factor model_ (assuming assets A and C belong to sector 1 and asset B to sector 2):
```{r}
stock_sector_info <- c(1, 2, 1)
barra_model <- factorModel(X, type = "Barra", stock_sector_info = stock_sector_info)
# sanity check
X_ <- with(barra_model,
matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual)
norm(X - X_, "F")
```
Finally, we build a _statistical factor model_, which is based on principal component analysis (PCA):
```{r}
# set factor dimension as K=2
stat_model <- factorModel(X, K = 2)
# sanity check
X_ <- with(stat_model,
matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual)
norm(X - X_, "F")
```
## Usage of `covFactorModel()`
The function `covFactorModel()` estimates the covariance matrix of the data based on factor models. The user can choose not only the type of factor model (i.e., macroeconomic, BARRA, or statistical) but also the structure of the residual covariance matrix (i.e., scaled identity, diagonal, block diagonal, and full).
We start by preparing some synthetic data:
```{r}
library(covFactorModel)
library(xts)
library(MASS)
# generate synthetic data
set.seed(234)
K <- 1 # number of factors
N <- 400 # number of stocks
mu <- rep(0, N)
beta <- mvrnorm(N, rep(1, K), diag(K)/10)
Sigma <- beta %*% t(beta) + diag(N)
print(eigen(Sigma)$values[1:10])
```
Then, we simply use function `covFactorModel()` (by default it uses a _statistical factor model_ and a diagonal structure for the residual covariance matrix). We show the average error w.r.t number of observations:
```{r, warning = FALSE}
# estimate error by loop
err_scm_vs_T <- err_statPCA_diag_vs_T <- c()
index_T <- N*seq(5)
for (T in index_T) {
X <- xts(mvrnorm(T, mu, Sigma), order.by = as.Date('1995-03-15') + 1:T)
# use statistical factor model
cov_statPCA_diag <- covFactorModel(X, K = K, max_iter = 10)
err_statPCA_diag_vs_T <- c(err_statPCA_diag_vs_T, norm(Sigma - cov_statPCA_diag, "F")^2)
# use sample covariance matrix
err_scm_vs_T <- c(err_scm_vs_T, norm(Sigma - cov(X), "F")^2)
}
res <- rbind(err_scm_vs_T, err_statPCA_diag_vs_T)
rownames(res) <- c("SCM", "stat + diag")
colnames(res) <- paste0("T/N=", index_T/N)
print(res)
```
## Usage of `getSectorInfo()`
The function `getSectorInfo()` provides sector information for a given set of stock symbols. The usage is rather simple:
```{r}
library(covFactorModel)
mystocks <- c("AAPL", "ABBV", "AET", "AMD", "APD", "AA","CF", "A", "ADI", "IBM")
getSectorInfo(mystocks)
```
The built-in sector database can be overidden by providing a stock-sector pairing:
```{r}
my_stock_sector_database <- cbind(mystocks, c(rep("sector1", 3),
rep("sector2", 4),
rep("sector3", 3)))
getSectorInfo(mystocks, my_stock_sector_database)
```
## Links
Package: [GitHub](https://github.com/dppalomar/covFactorModel).
README file: [GitHub-readme](https://rawgit.com/dppalomar/covFactorModel/master/README.html).
Vignette: [GitHub-html-vignette](https://rawgit.com/dppalomar/covFactorModel/master/vignettes/covFactorModel-vignette.html) and [GitHub-pdf-vignette](https://rawgit.com/dppalomar/covFactorModel/master/vignettes/covFactorModel-vignette.pdf).