Another common approach to modeling (nonstationary) time series is based on differencing and the related class of ARIMA models. The goal here is to understand the differencing operation, its implications for modeling, and when it should be used.
Definition 2.1: The difference operator \(\Delta\) is defined as \[ \Delta X_t = (I-B)X_t = X_t - X_{t-1},\quad t\in\mathbb{Z}, \] where \(X=\{X_t\}_{t\in\mathbb{Z}}\) is a time series, \(B\) is the backward shift operator and \(I=B^0\).
The differencing operation can be aggregated. E.g. \[ \Delta^2 X_t = (I-B) (I-B) X_t = (I-B)(X_t - X_{t-1}) = X_t - 2X_{t-1} + X_{t-2}, \] which is also the same as \((I-B)^2X_t=(I-2B+B^2)X_t\).
Note that \(\Delta X_t = b\) for \(X_t = a+bt\), \(\Delta^2 X_t = 2c\) for \(X_t = a+bt+ct^2\), etc. Then, why not apply \(\Delta\) or \(\Delta^2\) or … to a nonstationary series (in order to remove any trends) and model the residuals as a stationary series?
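A quick numerical check of these two facts in R (the trend coefficients a, b, c below are arbitrary illustrative values):

```r
# differencing removes polynomial trends
t <- 1:20
a <- 5; b <- 0.3; c <- 0.1
x_lin  <- a + b * t            # linear trend
x_quad <- a + b * t + c * t^2  # quadratic trend
diff(x_lin)                    # every entry equals b = 0.3
diff(x_quad, differences = 2)  # every entry equals 2c = 0.2
# the identity (I-B)^2 X_t = X_t - 2 X_{t-1} + X_{t-2}:
all.equal(unname(diff(x_quad, differences = 2)),
          x_quad[3:20] - 2 * x_quad[2:19] + x_quad[1:18])
```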
Recall that we fitted the model \(X_t = 625.55 -0.0242 t + Y_t\) with \(\{Y_t\}\) being the AR(2) series satisfying \(Y_t = 1.0050Y_{t-1} -0.2925Y_{t-2}+Z_t\) with \(\{Z_t\}\sim {\rm WN}(0,0.457)\). What about differencing instead?
data(LakeHuron)
ts <- LakeHuron                # note: the name ts masks the base R function ts()
ts_diff <- diff(ts, lag = 1)   # first difference
par(mfrow = c(1, 3))
plot(ts_diff,ylab = 'Feet',xlab='Year')
acf(ts_diff, lag.max = 20, ylim=c(-1, 1))
pacf(ts_diff, lag.max = 20, ylim=c(-1, 1))
library(forecast)
auto.arima(ts_diff,max.p=5,max.q=5,ic="aic")
## Series: ts_diff
## ARIMA(0,0,0) with zero mean
##
## sigma^2 estimated as 0.5553: log likelihood=-109.11
## AIC=220.22 AICc=220.26 BIC=222.79
auto.arima(ts_diff,max.p=5,max.q=5,ic="bic")
## Series: ts_diff
## ARIMA(0,0,0) with zero mean
##
## sigma^2 estimated as 0.5553: log likelihood=-109.11
## AIC=220.22 AICc=220.26 BIC=222.79
# See the package manual for "allowdrift"
auto.arima(ts,max.p=5,max.q=5,max.d=2,ic="aic",allowdrift=TRUE)
## Series: ts
## ARIMA(0,1,0)
##
## sigma^2 estimated as 0.5588: log likelihood=-109.11
## AIC=220.22 AICc=220.26 BIC=222.79
auto.arima(ts,max.p=5,max.q=5,max.d=2,ic="bic",allowdrift=TRUE)
## Series: ts
## ARIMA(0,1,0)
##
## sigma^2 estimated as 0.5588: log likelihood=-109.11
## AIC=220.22 AICc=220.26 BIC=222.79
# this uses the functions in "forecast" package
arima.model <- Arima(ts, order=c(0,1,0), method = "ML",include.drift = FALSE)
arima.model
## Series: ts
## ARIMA(0,1,0)
##
## sigma^2 estimated as 0.5588: log likelihood=-109.11
## AIC=220.22 AICc=220.26 BIC=222.79
h = 5
ts.forecast <- forecast(arima.model, h)
plot(ts.forecast)
Note: The forecasts and prediction intervals are quite different from the previous model.
What model exactly is fitted above, and why are the prediction intervals different? Which model, the one above or the one from the previous lecture, should be preferred?
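To see why the intervals differ, recall that an ARIMA(0,1,0) model without drift is a random walk: the h-step forecast is just the last observation, and the forecast error variance grows linearly in h. A minimal sketch (sigma2 is the estimate printed in the output above):

```r
# for a random walk X_t = X_{t-1} + Z_t, the h-step forecast is X_n,
# and the error X_{n+h} - X_n = Z_{n+1} + ... + Z_{n+h} has variance h*sigma^2,
# so 95% prediction intervals widen like sqrt(h)
sigma2 <- 0.5588                          # estimate from the fitted model above
h <- 1:5
last_obs <- as.numeric(tail(LakeHuron, 1))
lower <- last_obs - 1.96 * sqrt(h * sigma2)
upper <- last_obs + 1.96 * sqrt(h * sigma2)
cbind(h, point = last_obs, lower, upper)
```

A stationary AR(2) model, by contrast, has forecast error variance that converges to the (finite) variance of the series, so its prediction intervals level off instead of widening indefinitely.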
Whether a model with a unit root or a (possibly trend-)stationary model (with a nearly unit root) should be used is a statistical question (especially for smaller sample sizes). For example, below are plots of an AR(\(1\)) series with \(\phi_1=0.85\) and of a random walk (\(\phi_1=1\)).
set.seed(123456)
ts_1 <- arima.sim(list(order = c(1,0,0), ar = 0.85), n = 75)
ts_2 <- arima.sim(list(order = c(0,1,0)), n = 75)
par(mfrow = c(1, 2))
plot(ts_1, type="l")
plot(ts_2, type="l")
par(mfrow = c(1, 2))
acf(ts_1, lag.max=20, ylim=c(-1, 1))
acf(ts_2, lag.max=20, ylim=c(-1, 1))
par(mfrow = c(1, 2))
pacf(ts_1, lag.max=20, ylim=c(-1, 1))
pacf(ts_2, lag.max=20, ylim=c(-1, 1))
The confusion is even more delicate in the presence of a trend. E.g. a trend stationary series (which looks more nonstationary) can be confused more easily with a random walk.
There are a number of statistical tests to help one decide between the two models. One of the best known is the (augmented) Dickey-Fuller (DF or ADF) test. For example, a typical hypothesis testing problem is:
\(H_0:\) \(\{X_t\}\) is an AR(\(p\)) series with a unit root (that is, \(\phi(1)=0\))
\(H_1:\) \(\{X_t\}\) is a stationary AR(\(p\)) series (that is, \(\phi(1)\neq 0\))
More generally, a (linear) drift or a more general trend can be added to \(\{X_t\}\). The model to be tested is written, in general, as \[ X_t = \beta_0 + \beta_1 t + \beta_2 \frac{t^2}{2} + \phi_1 X_{t-1} + \ldots + \phi_p X_{t-p} + Z_t \] or \[ \Delta X_t = \beta_1 + \beta_2 t + \pi X_{t-1} + \sum_{j=1}^k \gamma_j \Delta X_{t-j} + Z_t \quad (k=p-1) \] with the unit root corresponding to \(\pi = \phi_1 + \phi_2+\ldots+\phi_p-1= 0\) and the null hypothesis to \(H_0: \pi = 0\); \(\beta_2\neq 0\) corresponding to a trend, and \(\beta_1\neq 0,\beta_2=0\) corresponding to a drift.
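The second regression form can be run by hand with lm(). A minimal sketch for the no-drift, no-trend case with \(k=1\) (this should correspond, up to rounding, to the ur.df(ts, lags=1, type='none') fit below):

```r
# ADF regression by hand: Delta X_t on X_{t-1} and Delta X_{t-1}
# (no drift, no trend; k = 1 lagged difference)
x  <- as.numeric(LakeHuron)
dx <- diff(x)
n  <- length(dx)
df_reg <- lm(dx[2:n] ~ 0 + x[2:n] + dx[1:(n - 1)])
summary(df_reg)$coefficients
# the t value on x[2:n] is the DF test statistic; compare it to
# Dickey-Fuller (not Student's t) critical values
```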
The test is based on the usual regression but the critical values are no longer those of a \(t\) distribution. (Why?)
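Under the null, the regressor \(X_{t-1}\) is itself a random walk, so the t-statistic on the lagged level does not have an asymptotically normal (or Student's t) distribution; its limit is a functional of Brownian motion. A small Monte Carlo sketch of the null distribution in the no-drift, no-trend case:

```r
# Monte Carlo approximation of the DF t-statistic under a unit root
set.seed(1)
nrep <- 2000; n <- 100
df_stat <- replicate(nrep, {
  x  <- cumsum(rnorm(n))             # random walk under H0
  dx <- diff(x)
  fit <- lm(dx ~ 0 + x[1:(n - 1)])   # DF regression, no drift/trend
  summary(fit)$coefficients[1, 3]    # t-statistic on the lagged level
})
quantile(df_stat, c(0.01, 0.05, 0.10))
# roughly -2.6, -1.95, -1.6: the tau1 critical values, noticeably below
# the corresponding Student's t quantiles (about -2.36, -1.66, -1.29)
```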
ts_model <- auto.arima(ts,max.p=5,max.q=0,max.d=0,ic="aic")
ts_model
## Series: ts
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 intercept
## 1.0436 -0.2495 579.0473
## s.e. 0.0983 0.1008 0.3319
##
## sigma^2 estimated as 0.4939: log likelihood=-103.63
## AIC=215.27 AICc=215.7 BIC=225.61
abs(polyroot(c(1,-ts_model$coef[1],-ts_model$coef[2])))
## [1] 1.486430 2.696429
library(urca)
# See Pfaff's book in the references below
ur.no <- ur.df(ts, lags=1, type='none')
summary(ur.no)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9351 -0.5560 -0.0311 0.4486 2.0002
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -3.405e-05 1.295e-04 -0.263 0.793
## z.diff.lag 1.319e-01 1.001e-01 1.318 0.191
##
## Residual standard error: 0.7344 on 94 degrees of freedom
## Multiple R-squared: 0.01891, Adjusted R-squared: -0.001967
## F-statistic: 0.9058 on 2 and 94 DF, p-value: 0.4077
##
##
## Value of test-statistic is: -0.263
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.6 -1.95 -1.61
plot(ur.no)
ur.tr <- ur.df(ts, lags=1, type='drift')
summary(ur.tr)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.71212 -0.42575 0.00349 0.43147 1.69043
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 124.94994 32.06259 3.897 0.000183 ***
## z.lag.1 -0.21584 0.05538 -3.898 0.000183 ***
## z.diff.lag 0.23757 0.09714 2.446 0.016337 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6846 on 93 degrees of freedom
## Multiple R-squared: 0.156, Adjusted R-squared: 0.1379
## F-statistic: 8.596 on 2 and 93 DF, p-value: 0.0003754
##
##
## Value of test-statistic is: -3.8977 7.6333
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.51 -2.89 -2.58
## phi1 6.70 4.71 3.86
plot(ur.tr)
The unit root testing flowchart from Pfaff (2008) is given below, though the steps are nearly impossible to understand from Pfaff (2008) alone. See instead Enders (1995), Chapter 4, Section 7.
library(urca)
ur.trend <- ur.df(ts, lags=1, type='trend')
summary(ur.trend)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6029 -0.4416 0.0248 0.3907 1.6720
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 161.785553 38.979026 4.151 7.4e-05 ***
## z.lag.1 -0.279036 0.067172 -4.154 7.3e-05 ***
## tt -0.004999 0.003063 -1.632 0.10609
## z.diff.lag 0.278779 0.099536 2.801 0.00621 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6785 on 92 degrees of freedom
## Multiple R-squared: 0.1798, Adjusted R-squared: 0.153
## F-statistic: 6.721 on 3 and 92 DF, p-value: 0.0003764
##
##
## Value of test-statistic is: -4.1541 6.0678 9.0636
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2 6.50 4.88 4.16
## phi3 8.73 6.49 5.47
Note: What is the conclusion for the Lake Huron time series?
Note: There are also tests for unit roots in the moving average part of an ARMA model. (What purpose do they serve?)
Q0: Supplement your analysis of the time series data from Q1, Lecture I, with unit root tests if that seems appropriate.
Q1:* What is the connection between the auto.arima selected model and the model suggested by the ADF test? Why was there a contradiction for the Lake Huron data above?
Q2: Prove the asymptotics for the ADF statistic under the null (in one of the trend/drift or no trend/drift cases).
Q3: What are other tests besides the ADF test? What (dis)advantages do they have?
Q4: What are fractionally differenced series?
Q5: Can you think of a situation/model that would “break” the unit root testing flowchart?
Choi, I. 2015. Almost All About Unit Roots. Themes in Modern Econometrics. Cambridge University Press, New York.
Enders, W. 1995. Applied Econometric Time Series. John Wiley &amp; Sons.
Maddala, G. S., and I.-M. Kim. 1998. Unit Roots, Cointegration and Structural Change. Themes in Modern Econometrics. Cambridge University Press, Cambridge.
Pfaff, B. 2008. Analysis of Integrated and Cointegrated Time Series with R. Second. Use R! Springer, New York.