This section is about using regression on several non-stationary (integrated, stochastic trend) series, and/or about finding relationships among such series. We shall also go back to “lower dimension” for the time being.
Punchline: regressing stochastic trend series on one another is problematic. Regressing their stationary increment series is better but, more interestingly, the so-called cointegrating relationships are even more revealing and more powerful.
library(tseries)
library(urca)
data(Raotbl3)
attach(Raotbl3)
# lc = log consumption expenditure
# li = log income
# lw = log wealth
lc <- ts(lc, start=c(1966,4), end=c(1991,2), frequency=4)
li <- ts(li, start=c(1966,4), end=c(1991,2), frequency=4)
lw <- ts(lw, start=c(1966,4), end=c(1991,2), frequency=4)
# restricting as considered in Pfaff, Ch. 7
ukcons <- window(cbind(lc, li, lw), start=c(1967, 2), end=c(1991,2))
ts.plot(ukcons,main="Quarterly UK log consumption/income/wealth",xlab="Year",lty=c(1,2,3))
legend('right',c('consumption','income','wealth'),lty=c(1,2,3))
par(mfrow = c(1, 3))
ts.plot(lc,main="Quarterly UK log consumption",xlab="Year")
ts.plot(li,main="Quarterly UK log income",xlab="Year")
ts.plot(lw,main="Quarterly UK log wealth",xlab="Year")
library(urca)
#ur.lc <- ur.df(lc, lags=3, type='trend')
#ur.lc <- ur.df(lc, lags=3, type='drift')
#summary(ur.lc)
#plot(ur.lc)
#ur.li <- ur.df(li, lags=0, type='trend')
#ur.li <- ur.df(li, lags=0, type='drift')
#summary(ur.li)
#plot(ur.li)
#ur.lw <- ur.df(lw, lags=0, type='trend')
#ur.lw <- ur.df(lw, lags=0, type='drift')
ur.lw <- ur.df(lw, lags=0, type='none')
summary(ur.lw)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.158325 -0.020807 0.005545 0.024886 0.088494
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 0.0006217 0.0003254 1.911 0.059 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04231 on 97 degrees of freedom
## Multiple R-squared: 0.03627, Adjusted R-squared: 0.02633
## F-statistic: 3.65 on 1 and 97 DF, p-value: 0.05902
##
##
## Value of test-statistic is: 1.9106
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.6 -1.95 -1.61
plot(ur.lw)
Conclusions following the unit root testing flowchart from the Univariate II lecture:
log consumption (lc): the model \(\Delta X_t = \beta_1 \Delta X_{t-1} + \beta_2 \Delta X_{t-2} + \beta_3 \Delta X_{t-3} + Z_t\) with \(\{Z_t\}\sim\mbox{WN}(0,\sigma^2_Z)\);
log income (li): the model \(\Delta X_t = Z_t\) with \(\{Z_t\}\sim\mbox{WN}(0,\sigma^2_Z)\) (conclusion at the 1% level);
log wealth (lw): the same model \(\Delta X_t = Z_t\) with \(\{Z_t\}\sim\mbox{WN}(0,\sigma^2_Z)\).
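As a quick check on these conclusions (a sketch, not part of the original flowchart analysis; the zero-mean specifications follow the models as written above), one can fit the implied models to the differenced series:
# Fit a zero-mean AR(3) to diff(lc), as suggested by the conclusion above
fit.dlc <- arima(diff(lc), order = c(3, 0, 0), include.mean = FALSE)
fit.dlc
Box.test(residuals(fit.dlc), lag = 12, type = "Ljung-Box", fitdf = 3)
# diff(li) and diff(lw) are concluded to be white noise; portmanteau checks:
Box.test(diff(li), lag = 12, type = "Ljung-Box")
Box.test(diff(lw), lag = 12, type = "Ljung-Box")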
summary(lm(lc ~ li, data=ukcons))
##
## Call:
## lm(formula = lc ~ li, data = ukcons)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.065902 -0.016474 0.001461 0.017803 0.057107
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.19799 0.14997 -1.32 0.19
## li 1.00894 0.01376 73.33 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.024 on 95 degrees of freedom
## Multiple R-squared: 0.9826, Adjusted R-squared: 0.9825
## F-statistic: 5377 on 1 and 95 DF, p-value: < 2.2e-16
summary(lm(li ~ lw, data=ukcons))
##
## Call:
## lm(formula = li ~ lw, data = ukcons)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.27407 -0.06104 0.05010 0.08426 0.19018
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.52455 0.54894 10.06 < 2e-16 ***
## lw 0.40887 0.04176 9.79 4.64e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1263 on 95 degrees of freedom
## Multiple R-squared: 0.5022, Adjusted R-squared: 0.497
## F-statistic: 95.85 on 1 and 95 DF, p-value: 4.636e-16
library(lmtest)
set.seed(123456)
e1 <- rnorm(500)
e2 <- rnorm(500)
#trd <- 1:500
# y1 <- 0.8*trd + cumsum(e1)
# y2 <- 0.6*trd + cumsum(e2)
y1 <- cumsum(e1)
y2 <- cumsum(e2)
par(mfrow = c(1, 2))
ts.plot(y1)
ts.plot(y2)
sr.reg <- lm(y1 ~ y2)
sr.dw <- dwtest(sr.reg)$statistic
summary(sr.reg)
##
## Call:
## lm(formula = y1 ~ y2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.5789 -2.3936 -0.2456 2.1271 9.7228
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.99093 0.26909 40.84 <2e-16 ***
## y2 -0.53774 0.01809 -29.73 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.434 on 498 degrees of freedom
## Multiple R-squared: 0.6395, Adjusted R-squared: 0.6388
## F-statistic: 883.6 on 1 and 498 DF, p-value: < 2.2e-16
sr.dw
## DW
## 0.1021637
# Small Durbin-Watson statistic; see e.g. https://en.wikipedia.org/wiki/Durbin–Watson_statistic
# A common rule of thumb: a regression is suspected to be spurious when R^2 > DW
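Checking the rule of thumb directly on this simulation (the summary above gives R^2 of about 0.64 versus a DW statistic of about 0.10):
summary(sr.reg)$r.squared > sr.dw   # TRUE here: R^2 exceeds DW, flagging the regression as spurious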
For example, the following result is known (Hamilton (1994), p. 560). Consider a bivariate time series \(\mathbf{Y} = \{(Y_{1t},Y_{2t})'\}_{t\in\mathbb{Z}}\) where \(\{Y_{1t}\}_{t\in\mathbb{Z}}\) and \(\{Y_{2t}\}_{t\in\mathbb{Z}}\) are two independent random walks defined through
\[\begin{eqnarray}
Y_{1t} & = & Y_{1,t-1} + \epsilon_{1t},\\
Y_{2t} & = & Y_{2,t-1} + \epsilon_{2t},
\end{eqnarray}\]
where \(\{\epsilon_{1t}\}_{t\in\mathbb{Z}}\) and \(\{\epsilon_{2t}\}_{t\in\mathbb{Z}}\) are independent series of i.i.d. variables having means \(0\) and variances \(\sigma^2_1\) and \(\sigma^2_2\), respectively. Then, the OLS estimators \(\widehat \alpha_T\) and \(\widehat \gamma_T\) in the regression \(Y_{1t} = \alpha + \gamma Y_{2t} + u_t\) can be shown to satisfy, for large sample size \(T\), \[ \left( \begin{array}{c} T^{-1/2}\widehat \alpha_T\\ \widehat \gamma_T \end{array} \right) \approx \left( \begin{array}{c} \sigma_1 \cdot h_1\\ \frac{\sigma_1}{\sigma_2}\cdot h_2 \end{array} \right), \] where \(h_1\) and \(h_2\) are certain non-Gaussian random variables (that do not depend on \(\sigma_1^2\) or \(\sigma_2^2\)).
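A minimal Monte Carlo sketch of this result (sample sizes and replication count below are arbitrary choices for illustration): the slope estimate \(\widehat \gamma_T\) remains random and does not concentrate as \(T\) grows.
# Slope from regressing one independent random walk on another
gamma.hat <- function(TT) {
  y1 <- cumsum(rnorm(TT))   # independent random walk
  y2 <- cumsum(rnorm(TT))   # independent random walk
  coef(lm(y1 ~ y2))[2]
}
set.seed(1)
sd(replicate(500, gamma.hat(100)))    # spread of gamma.hat at T = 100
sd(replicate(500, gamma.hat(1000)))   # comparable spread at T = 1000: no concentration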
One “obvious” remedy is to consider stationary increments. For other remedies (e.g. including lagged variables), see Hamilton (1994), p. 561.
summary(lm(diff(lc) ~ diff(li), data=ukcons))
##
## Call:
## lm(formula = diff(lc) ~ diff(li), data = ukcons)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.047274 -0.006248 0.001609 0.006785 0.047421
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.004858 0.001424 3.411 0.000957 ***
## diff(li) 0.244151 0.073899 3.304 0.001350 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01312 on 94 degrees of freedom
## Multiple R-squared: 0.104, Adjusted R-squared: 0.09451
## F-statistic: 10.92 on 1 and 94 DF, p-value: 0.00135
summary(lm(diff(li) ~ diff(lw), data=ukcons))
##
## Call:
## lm(formula = diff(li) ~ diff(lw), data = ukcons)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.049859 -0.009938 0.000173 0.010431 0.053542
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.006413 0.001898 3.379 0.00106 **
## diff(lw) 0.017723 0.044081 0.402 0.68856
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0183 on 94 degrees of freedom
## Multiple R-squared: 0.001717, Adjusted R-squared: -0.008903
## F-statistic: 0.1616 on 1 and 94 DF, p-value: 0.6886
Disadvantages of differencing (Pfaff (2008), Ch. 4, p. 75): “First, differencing greatly attenuates large positive residual autocorrelation; hence, false inferences upon the coefficients in the regression equation could be drawn. Second, most economic theories are expressed in levels, and the implications of the long-run relationships between variables are deduced.” (For an explanation of the first disadvantage, see Hamilton (1994), p. 562.) A third disadvantage is that, by looking only at differences, one may miss other important relationships, now called cointegrating relationships.
There is something interesting happening with the residuals of the regressions above.
library(urca)
lc.eq <- summary(lm(lc ~ li, data=ukcons))
lw.eq <- summary(lm(li ~ lw, data=ukcons))
error.lc <- ts(resid(lc.eq), start=c(1967,2), end=c(1991,2), frequency=4)
error.lw <- ts(resid(lw.eq), start=c(1967,2), end=c(1991,2), frequency=4)
par(mfrow = c(1, 2))
ts.plot(error.lc)
ts.plot(error.lw)
par(mfrow = c(1, 2))
pacf(error.lc, lag.max = 20)
pacf(error.lw, lag.max = 20)
ci.lc <- ur.df(error.lc, lags=1, type='none')
ci.lw <- ur.df(error.lw, lags=0, type='none')
#ci.lw <- ur.df(error.lw, lags=0, type='trend')
summary(ci.lc)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.059909 -0.011226 0.001923 0.012520 0.040189
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -0.20001 0.07676 -2.606 0.010681 *
## z.diff.lag -0.36809 0.09652 -3.814 0.000246 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01646 on 93 degrees of freedom
## Multiple R-squared: 0.2717, Adjusted R-squared: 0.2561
## F-statistic: 17.35 on 2 and 93 DF, p-value: 3.945e-07
##
##
## Value of test-statistic is: -2.6055
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.6 -1.95 -1.61
summary(ci.lw)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.046366 -0.011545 -0.000162 0.018536 0.074973
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -0.03721 0.01994 -1.867 0.0651 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02448 on 95 degrees of freedom
## Multiple R-squared: 0.03537, Adjusted R-squared: 0.02522
## F-statistic: 3.484 on 1 and 95 DF, p-value: 0.06505
##
##
## Value of test-statistic is: -1.8665
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.6 -1.95 -1.61
The test statistics need to be compared to the following critical value (from MacKinnon (2010)):
T <- length(lc)
# N=1, 5%, "nc"
binf <- -1.94100
b1 <- -0.2686
b2 <- -3.365
b3 <- 31.223
cr_value <- binf + b1/T + b2/T^2 + b3/T^3
cr_value
## [1] -1.944024
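Comparing the two residual-based ADF statistics above to this critical value (a small sketch using the teststat slot of the ur.df objects fitted above):
ci.lc@teststat < cr_value   # TRUE: -2.61 < -1.944, reject the unit root for the lc ~ li residuals
ci.lw@teststat < cr_value   # FALSE: -1.87 > -1.944, cannot reject for the li ~ lw residuals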
This suggests that a linear combination of lc and li is stationary, but that this is not the case for li and lw. Let us examine the following example, where a linear combination of two variables is stationary, and also see how differencing would fare.
Example 3.1: Consider the bivariate series \(\mathbf{Y} = \{(Y_{1t},Y_{2t})'\}_{t\in\mathbb{Z}}\) defined as \[\begin{eqnarray} Y_{1t} & = & \gamma Y_{2,t-1} + \gamma_1 \Delta Y_{2,t-1} + \epsilon_{1t},\\ Y_{2t} & = & Y_{2,t-1} + \epsilon_{2t}, \end{eqnarray}\]where \(\gamma\) and \(\gamma_1\) are parameters, and \(\{\epsilon_{1t}\}_{t\in\mathbb{Z}}\) and \(\{\epsilon_{2t}\}_{t\in\mathbb{Z}}\) are independent series of i.i.d. variables having means \(0\) and variances \(\sigma^2_1\) and \(\sigma^2_2\), respectively. That is, \(\{Y_{2t}\}\) is a random walk, and a linear combination of \(Y_{1t}\) and \(Y_{2t}\) is stationary: indeed, \(Y_{1t} - \gamma Y_{2t} = -\gamma \Delta Y_{2t} + \gamma_1 \Delta Y_{2,t-1} + \epsilon_{1t}\).
Note: Analogy when \(\gamma=1\), \(\gamma_1=0\): \(Y_{2t}=\) position of a drunkard, and \(Y_{1t}=\) position of his (drunk?) dog on a leash (cf. Murray (1994)).
set.seed(123456)
n <- 100 # n is T
sigma1 <- 1
sigma2 <- 1
gamma <- 0.4
gamma1 <- 0
#gamma <- 1
y20 <- 3
y1 = NULL ;
y2 = NULL ;
y2[1] = y20 + rnorm(1,0,sigma2)
y1[1] = y2[1] + rnorm(1,0,sigma1)
for (i in 2:n) {
y2[i] = y2[i-1] + rnorm(1,0,sigma2)
y1[i] = gamma*y2[i-1] + gamma1*(y2[i] - y2[i-1]) + rnorm(1,0,sigma1)
}
plot.ts(y1,ylim=c(-8,10))
lines(y2, lty='dotted')
Note that the error-correction term \(\gamma Y_{2,t-1} - Y_{1,t-1}\) is stationary; this is the term exploited in the cointegration approach (Model 2) below.
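A quick check on the simulated series (a sketch; for simplicity it uses the true \(\gamma\) rather than an estimate):
# The combination y1 - gamma*y2 should look stationary, unlike y1 and y2 themselves
ts.plot(y1 - gamma * y2)
summary(ur.df(y1 - gamma * y2, lags = 0, type = 'none'))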
# Differenced series
dy1 = ts(diff(y1))
dy2 = ts(diff(y2))
# Combine for later
series <- window(cbind(y1, y2))
Vt <- window(cbind(dy1,dy2))
# Plot series
ts.plot(series,main="Both time series",xlab="Time",lty=c(1,2))
legend('topleft',c('y1','y2'),lty=c(1,2))
# Plot differenced series
ts.plot(Vt,main="Both differenced time series",xlab="Time",lty=c(1,2))
legend('topleft',c('dy1','dy2'),lty=c(1,2))
###################### MODEL 1 ########################
# Simple VAR approach
library(vars)
VARselect(Vt, lag.max=4, type="const")$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 4 2 2 4
mod1 <- VAR(Vt, p=2, type="none")
###################### MODEL 2 ########################
# Cointegration approach: Two-step Engle-Granger procedure
# Step 1: regress variables and get error term
lr.reg <- lm(y1 ~ 0+y2) #long-run equation
summary(lr.reg) #estimate should be close to gamma
##
## Call:
## lm(formula = y1 ~ 0 + y2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.5503 -0.5182 0.1684 0.9158 2.3794
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y2 0.38860 0.01818 21.38 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.053 on 99 degrees of freedom
## Multiple R-squared: 0.8219, Adjusted R-squared: 0.8201
## F-statistic: 457 on 1 and 99 DF, p-value: < 2.2e-16
error <- ts(residuals(lr.reg)) #error terms
#Check: error ought to be stationary
ts.plot(error)
# Step 2: create ECM model
lerror <- error[-c(n-1,n)] #lagged error
ldy1 <- dy1[-(n-1)] #lagged differences
ldy2 <- dy2[-(n-1)]
# Regress on own history, lagged changes of other variable and regression error
mod2.y1 <- lm(dy1[-1] ~ lerror + ldy1 + ldy2)
mod2.y2 <- lm(dy2[-1] ~ lerror + ldy1 + ldy2)
##################### Comparison of methods #######################
### Comparison 1: Fit model to first half of data,
# then compute 1-step ahead
# predictions for second half of data
# Generate next n observations:
for (i in (n+1):(2*n)) {
y2[i] <- y2[i-1] + rnorm(1,0,sigma2)
y1[i] <- gamma*y2[i] + rnorm(1,0,sigma1)
}
dy1 <- ts(diff(y1))
dy2 <- ts(diff(y2))
Vt <- window(cbind(dy1,dy2))
# For the cointegration model, we need the errors again
error.new <- ts(residuals(lm(y1 ~ 0+y2)))
# Forecast 1-step-ahead predictions
mod1.pred1 = NULL
mod2.pred1 = NULL
for (i in (n+1):(2*n-1)) {
mod1.pred1[i] = ( Bcoef(mod1)[,1:2] %*% matrix(Vt[i-1,]) + Bcoef(mod1)[,3:4] %*% matrix(Vt[i-2,]))[1]
mod2.pred1[i] = mod2.y1$coefficients %*% c(1,error.new[i-1],dy1[i-1],dy2[i-1])
}
preds1 <- window(cbind(dy1, mod1.pred1[1:(2*n-1)],mod2.pred1[1:(2*n-1)]))
ts.plot(preds1,lty=c(1,2,3),lwd = c(1,2,2),main="Comparison of models",xlim=c(n-2,2*n),col=c("black","red","blue"))
legend('topleft',c('dy1','VAR prediction','ECM prediction'),lty=c(1,2,3),lwd = c(1,2,2),col=c("black","red","blue"))
# Sum of squared prediction error
sum((mod1.pred1[(n+1):(2*n-1)] - dy1[(n+1):(2*n-1)])^2)
## [1] 140.7708
sum((mod2.pred1[(n+1):(2*n-1)] - dy1[(n+1):(2*n-1)])^2)
## [1] 101.7806
### Comparison 2: Plot 1-step ahead predictions at each step
mod1.pred2 = NULL ;
mod2.pred2 = NULL ;
for (i in 3:n) {
mod1.pred2[i] = ( Bcoef(mod1)[,1:2] %*% matrix(Vt[i-1,]) + Bcoef(mod1)[,3:4] %*% matrix(Vt[i-2,]))[1]
mod2.pred2[i] = mod2.y1$coefficients %*% c(1,error[i-1],dy1[i],dy2[i])
}
plot(2:n, dy1[1:(n-1)], lty=1, main="Comparison of models", xlab="Time", ylab="dy1")
lines(2:n, dy1[1:(n-1)])
points(3:n,mod1.pred2[3:n],pch=2,col="red")
points(3:n,mod2.pred2[3:n],pch=3,col="blue")
legend('bottomright',c('dy1','VAR prediction','ECM prediction'),pch=c(1,2,3),col=c("black","red","blue"))
grid()
# Sum of squared error
sum((mod1.pred2[3:n] - dy1[2:(n-1)])^2)
## [1] 371.9602
sum((mod2.pred2[3:n] - dy1[2:(n-1)])^2)
## [1] 100.4467
The more theoretical goals are to lay a basic foundation for the phenomenon of cointegration and to explain which basic models are used. We shall then come back to the UK consumption/income/wealth data set to fit such a model.
Definition 3.1: The component series \(\{X_{1t}\},\ldots,\{X_{dt}\}\) of the vector series \(\{\mathbf{X}_t\}\) are said to be cointegrated if (a) all components of \(\mathbf{X}_t\) are \(I(1)\) (that is, integrated of order \(1\)) and (b) there exists a vector \(\mathbf{\alpha}\neq 0\) such that \(Z_t = \mathbf{\alpha}'\mathbf{X}_t\) is \(I(0)\) (that is, stationary). The vector \(\mathbf{\alpha}\) is called a cointegrating vector. The number \(r\) of linearly independent cointegrating vectors is called the cointegrating rank of the vector series \(\{\mathbf{X}_t\}\).
Notes
In the data example above, the UK log consumption and income series would be called cointegrated.
In the discussion concerning the bivariate series \(\mathbf{Y}_t = (Y_{1t},Y_{2t})'\) in Example 3.1 above, it proved useful to fit the model (after changing the notation from \(\mathbf{Y}_t\) to \(\mathbf{X}_t\)) \[ \Delta \mathbf{X}_t = A Z_{t-1} + \Gamma \Delta \mathbf{X}_{t-1} + \mathbf{U}_t, \] where \(A\) is a \(2\times 1\) vector, \(\Gamma\) is a \(2\times 2\) matrix, \(\{\mathbf{U}_t\}\sim \mbox{WN}(0,\Sigma_{\mathbf U})\) and \(Z_t = \alpha' \mathbf{X}_t\) is thought of as stationary. In this relation, \(\alpha' \mathbf{X}_t = \alpha_1 X_{1t} + \alpha_2 X_{2t}=0\) is viewed as a long-run equilibrium between the series \(\{X_{1t}\}\) and \(\{X_{2t}\}\). The term \(AZ_{t-1}\) is viewed as an error-correction term (correcting deviations \(Z_{t-1}\) from the equilibrium) that determines explicitly how \(X_{1t}\) and \(X_{2t}\) react to deviations from the long-run equilibrium.
Note that the model above can also be expressed as \[ \Delta \mathbf{X}_t = \Pi \mathbf{X}_{t-1} + \Gamma \Delta \mathbf{X}_{t-1} + \mathbf{U}_t, \] where \(\Pi = A\alpha'\) is a \(2\times 2\) matrix having rank \(1\). This model is generalized to the following vector error correction model (VECM).
Definition 3.2: A vector error correction model, VECM for short, is expressed as either \[ \Delta \mathbf{X}_t = \Pi \mathbf{X}_{t-p} + \Gamma_1 \Delta \mathbf{X}_{t-1} + \ldots + \Gamma_{p-1} \Delta \mathbf{X}_{t-p+1} + \mathbf{U}_t \] in the so-called long-run form, or \[ \Delta \mathbf{X}_t = \Pi \mathbf{X}_{t-1} + \widetilde \Gamma_1 \Delta \mathbf{X}_{t-1} + \ldots + \widetilde \Gamma_{p-1} \Delta \mathbf{X}_{t-p+1} + \mathbf{U}_t \] in the so-called transitory form. (Note that these models can also be written as possibly non-stable VAR\((p)\) models.)
Three cases of the matrix \(\Pi\) are considered: (i) \(\mbox{rk}(\Pi) = d\); (ii) \(\mbox{rk}(\Pi) = 0\), that is, \(\Pi = 0\); and (iii) \(0 < \mbox{rk}(\Pi) < d\).
Case (i) corresponds to a stationary VAR series \(\mathbf{X}\). Case (ii) corresponds to a VAR model in first differences. The interesting case is the third one: it can be shown to correspond to the component series of \(\mathbf{X}\) being cointegrated, with cointegrating rank \(r = \mbox{rk}(\Pi)\).
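As a small illustration of case (iii) (a sketch reusing lr.reg, mod2.y1 and mod2.y2 fitted in Example 3.1 above), one can assemble \(\widehat \Pi = \widehat A \widehat \alpha'\) for that bivariate example and check that its rank is indeed 1:
A.hat <- matrix(c(coef(mod2.y1)["lerror"], coef(mod2.y2)["lerror"]), ncol = 1)  # loadings
alpha.hat <- matrix(c(1, -coef(lr.reg)[["y2"]]), ncol = 1)                      # (1, -gamma.hat)'
Pi.hat <- A.hat %*% t(alpha.hat)                                                # 2 x 2 matrix
qr(Pi.hat)$rank                                                                 # equals 1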
There are several tests for estimating the cointegrating rank (through the rank \(\mbox{rk}(\Pi)\)) and the cointegrating vectors. Here is a simulated data example from Pfaff (2008).
library(urca)
set.seed(12345)
e1 <- rnorm(250, 0, 0.5)
e2 <- rnorm(250, 0, 0.5)
e3 <- rnorm(250, 0, 0.5)
u1.ar1 <- arima.sim(model = list(ar = 0.75),
innov = e1, n = 250)
u2.ar1 <- arima.sim(model = list(ar = 0.3),
innov = e2, n = 250)
y3 <- cumsum(e3)
y1 <- 0.8 * y3 + u1.ar1
y2 <- -0.3 * y3 + u2.ar1
y.mat <- data.frame(y1, y2, y3)
vecm <- ca.jo(y.mat)
jo.results <- summary(vecm)
jo.results
##
## ######################
## # Johansen-Procedure #
## ######################
##
## Test type: maximal eigenvalue statistic (lambda max) , with linear trend
##
## Eigenvalues (lambda):
## [1] 0.27036254 0.15474942 0.01884032
##
## Values of teststatistic and critical values of test:
##
## test 10pct 5pct 1pct
## r <= 2 | 4.72 6.50 8.18 11.65
## r <= 1 | 41.69 12.91 14.90 19.19
## r = 0 | 78.17 18.90 21.07 25.75
##
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
##
## y1.l2 y2.l2 y3.l2
## y1.l2 1.000000 1.0000000 1.0000000
## y2.l2 -4.732436 0.2273774 0.1513858
## y3.l2 -2.129850 -0.6657324 2.3153224
##
## Weights W:
## (This is the loading matrix)
##
## y1.l2 y2.l2 y3.l2
## y1.d -0.034235501 -0.29705774 -0.008294582
## y2.d 0.145988517 -0.08137695 0.003335110
## y3.d 0.002429191 0.01025326 -0.010523045
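Reading the test table from the bottom up: \(r = 0\) is rejected at the 5% level (78.17 > 21.07), as is \(r \le 1\) (41.69 > 14.90), while \(r \le 2\) is not (4.72 < 8.18). The estimated cointegrating rank is thus \(r = 2\), matching the construction of the simulated series, and the restricted VECM can be estimated with this rank: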
vecm.r2 <- cajorls(vecm, r = 2)
vecm.r2
## $rlm
##
## Call:
## lm(formula = substitute(form1), data = data.mat)
##
## Coefficients:
## y1.d y2.d y3.d
## ect1 -0.331293 0.064612 0.012682
## ect2 0.094473 -0.709385 -0.009165
## constant 0.168371 -0.027019 0.025255
## y1.dl1 -0.227677 0.027012 0.068158
## y2.dl1 0.144452 -0.715607 0.040487
## y3.dl1 0.123467 -0.290828 -0.075251
##
##
## $beta
## ect1 ect2
## y1.l2 1.0000000 0.0000000
## y2.l2 0.0000000 1.0000000
## y3.l2 -0.7328534 0.2951962
#class(jo.results)
#slotNames(jo.results)
vecm.uk <- ca.jo(ukcons)
jo.results.uk <- summary(vecm.uk)
jo.results.uk
##
## ######################
## # Johansen-Procedure #
## ######################
##
## Test type: maximal eigenvalue statistic (lambda max) , with linear trend
##
## Eigenvalues (lambda):
## [1] 0.22714805 0.04881869 0.00419976
##
## Values of teststatistic and critical values of test:
##
## test 10pct 5pct 1pct
## r <= 2 | 0.40 6.50 8.18 11.65
## r <= 1 | 4.75 12.91 14.90 19.19
## r = 0 | 24.48 18.90 21.07 25.75
##
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
##
## lc.l2 li.l2 lw.l2
## lc.l2 1.00000000 1.0000000 1.0000000
## li.l2 -0.93033927 -0.3751971 -1.0868675
## lw.l2 -0.06256788 -0.3657988 -0.2263311
##
## Weights W:
## (This is the loading matrix)
##
## lc.l2 li.l2 lw.l2
## lc.d 0.1025904 -0.03008945 -0.0071639639
## li.d 0.5783467 -0.01519047 0.0003830183
## lw.d 0.3317454 0.08143889 -0.0300075356
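Here only \(r = 0\) is rejected at the 5% level (24.48 > 21.07, though not at the 1% level), so the cointegrating rank is taken to be \(r = 1\):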
vecm.r1 <- cajorls(vecm.uk, r = 1)
vecm.r1
## $rlm
##
## Call:
## lm(formula = substitute(form1), data = data.mat)
##
## Coefficients:
## lc.d li.d lw.d
## ect1 0.10259 0.57835 0.33175
## constant 0.02262 0.10149 0.06064
## lc.dl1 -0.21064 0.41451 0.01562
## li.dl1 0.18167 -0.55146 -0.10535
## lw.dl1 0.07722 0.01799 0.18509
##
##
## $beta
## ect1
## lc.l2 1.00000000
## li.l2 -0.93033927
## lw.l2 -0.06256788
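For forecasting, the fitted VECM can be transformed back into its level-VAR representation (cf. the note in Definition 3.2); one possibility is the vec2var function from the vars package (a sketch, not run above):
library(vars)
vecm.uk.var <- vec2var(vecm.uk, r = 1)   # level-VAR form of the r = 1 VECM
# predict(vecm.uk.var, n.ahead = 8)      # would give forecasts of lc, li and lw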
The following example was considered in the Multivariate I lecture. Are the cumulative sums of the income/consumption series cointegrated?
library(fpp)
data(usconsumption)
in2 <- cumsum(usconsumption[,1])
co2 <- cumsum(usconsumption[,2])
us2 <- data.frame(in2,co2)
vecm.us2 <- ca.jo(us2)
jo.results.us2 <- summary(vecm.us2)
jo.results.us2
##
## ######################
## # Johansen-Procedure #
## ######################
##
## Test type: maximal eigenvalue statistic (lambda max) , with linear trend
##
## Eigenvalues (lambda):
## [1] 0.09429336 0.01005329
##
## Values of teststatistic and critical values of test:
##
## test 10pct 5pct 1pct
## r <= 1 | 1.64 6.50 8.18 11.65
## r = 0 | 16.04 12.91 14.90 19.19
##
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
##
## in2.l2 co2.l2
## in2.l2 1.000000 1.0000000
## co2.l2 -1.074504 -0.9064512
##
## Weights W:
## (This is the loading matrix)
##
## in2.l2 co2.l2
## in2.d 0.02978937 -0.011096130
## co2.d 0.19524476 -0.003174373
vecm.us2.r1 <- cajorls(vecm.us2, r = 1)
vecm.us2.r1
## $rlm
##
## Call:
## lm(formula = substitute(form1), data = data.mat)
##
## Coefficients:
## in2.d co2.d
## ect1 0.02979 0.19524
## constant 0.58768 1.28709
## in2.dl1 0.31202 0.58639
## co2.dl1 0.06219 -0.36227
##
##
## $beta
## ect1
## in2.l2 1.000000
## co2.l2 -1.074504
Question: Which model would you prefer?
Q0: Supplement your analysis of multivariate time series with VECM modeling if it seems appropriate. (Even if the series are stationary, are the cumulative series cointegrated?)
Q1: Show that the OLS estimators in the regression of independent random walks indeed have the asymptotics stated above.
Q2: Show that in the VECM, the cointegration rank is indeed the rank of the matrix \(\Pi\) (when the latter is strictly between \(0\) and \(d\)).
Q3: Explain how the Johansen eigenvalue test statistic for cointegration is obtained and prove its asymptotics.
Q4: Interpret cointegration in the spectral domain as mentioned above.
Banerjee, A., J. J. Dolado, J. W. Galbraith, and D. Hendry. 1993. Co-Integration, Error Correction, and the Econometric Analysis of Non-Stationary Data. Oxford University Press.
Croux, C., and I. Wilms. 2015. “Sparse Cointegration.” Preprint.
Hamilton, J. D. 1994. Time Series Analysis. Princeton University Press, Princeton, NJ.
Hansen, P. R., and S. Johansen. 1998. Workbook on Cointegration. Oxford University Press.
Johansen, S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford University Press.
Koo, B., H. M. Anderson, M. H. Seo, and W. Yao. 2016. “High-Dimensional Predictive Regression in the Presence of Cointegration.” Preprint.
Lütkepohl, H. 2005. New Introduction to Multiple Time Series Analysis. Springer-Verlag, Berlin.
MacKinnon, J. G. 2010. “Critical Values for Cointegration Tests.” Queen’s Economics Department Working Paper.
Murray, M. P. 1994. “A Drunk and Her Dog: An Illustration of Cointegration and Error Correction.” The American Statistician 48 (1): 37–39.
Pfaff, B. 2008. Analysis of Integrated and Cointegrated Time Series with R. Second. Use R! Springer, New York.