What is this all about?

This is about some time series models for count data.

Some examples

Example 1: The monthly number of cases of poliomyelitis in the United States between 1970 and 1983.

library(acp)
data(polio)
plot.ts(polio)

ts1 <- polio$polio
acf(ts1,lag.max = 20)

#pacf(ts1,lag.max = 20)

mean(ts1)
## [1] 1.333333
var(ts1)
## [1] 3.50499
plot(table(ts1)/length(ts1),ylab="probs",xlab="values") 
points(0:15, dpois(0:15,lambda=mean(ts1)))

Example 2: The number of cases of campylobacter infections in the north of the province Quebec (Canada) in four week intervals from January 1990 to the end of October 2000.

library(tscount)

data(campy)
plot.ts(campy)

ts2 <- campy

acf(ts2,lag.max = 20)

#pacf(ts1,lag.max = 20)

mean(ts2)
## [1] 11.54286
var(ts2)
## [1] 53.24275
plot(table(ts2)/length(ts2),ylab="probs",xlab="values") 
points(0:55, dpois(0:55,lambda=mean(ts2)))

Example 3: The monthly totals of car drivers in Great Britain killed or seriously injured Jan 1969 to Dec 1984. Compulsory wearing of seat belts was introduced on 31 Jan 1983.

data(Seatbelts)

ts3 <- Seatbelts[, "VanKilled"]
plot.ts(ts3)

acf(ts3,lag.max = 20)

#pacf(ts1,lag.max = 20)

mean(ts3)
## [1] 9.057292
var(ts3)
## [1] 13.22707
plot(table(ts3)/length(ts3),ylab="probs",xlab="values") 
points(0:55, dpois(0:55,lambda=mean(ts3)))

Data/Theory goals of the lecture

The goals is to discuss count time series models based on GLM and illustrate them on data above. The main R package used is \(\verb!tscount!\), though there are a number of others. See Liboschik et al. (2016) and references therein.

Count time series models based on GLM

Let \(\{Y_t\}\) be a count time series and \(\{\mathbf{X}_t\}\) be an \(r\)-vector series of covariates. Let also \({\cal F}_t\) be the history of the joint process \(\{Y_t,\mathbf{X}_{t+1}\}\) up to time \(t\). Set \(\lambda_t = \mathbb{E} (Y_t|{\cal F}_{t-1})\). The general model is of the form \[ g(\lambda_t) = \beta_0 + \sum_{k=1}^p \beta_k \widetilde g(Y_{t-i_k}) + \sum_{\ell=1}^q \alpha_j g(\lambda_{t-j_\ell}) + \mathbf{u}'\mathbf{X}_t, \] where \(g\) is a link function, \(\widetilde g\) is a transformation function and \(P = \{i_1,\ldots,i_p\}\), \(Q = \{j_1,\ldots,j_q\}\) are sets of integers with \(0<i_1<\ldots<i_p\), \(0<j_1<\ldots<j_q\). For example, with \(g(x) = \widetilde g(x) = x\) and \(P=\{1,\ldots,p\}\), \(Q = \{1,\ldots,q\}\) and \(\mathbf{u}=0\), the model is \[ \lambda_t = \beta_0 + \sum_{k=1}^p \beta_k Y_{t-k} + \sum_{\ell=1}^q \alpha_j \lambda_{t-\ell}, \] which is known as integer-valued GARCH model of order \(p\) and \(q\), abbreviated as INGARCH\((p,q)\), and also as autoregressive conditional Poisson (ACP) model. In addition, the distributional assumption on \(Y_t\) given \({\cal F}_{t-1}\) is made, for example, Poisson, negative binomial or other.

The inference is (quasi) maximum likelihood-based, with its advantages for e.g. testing. E.g. 1-step-ahead prediction is \(\lambda_{T+1}\). In model diagnostics, e.g. the following residuals are considered \(y_t - \widehat \lambda_t\). See Liboschik et al. (2016).

Back to examples

Example 1: The monthly number of cases of poliomyelitis in the United States between 1970 and 1983.

# trend=(1:168/168)
# cos12=1+cos((2*pi*(1:168))/12)
# sin12=1+sin((2*pi*(1:168))/12)
# cos6=1+cos((2*pi*(1:168))/6)
# sin6=1+sin((2*pi*(1:168))/6)
#polio_cov <- data.frame(trend, cos12, sin12, cos6, sin6)

#polio_pois <- tsglm(polio$polio, model = list(past_obs = 1, past_mean = c(1,2)), xreg = polio_cov, distr = "poisson")

polio_pois <- tsglm(polio$polio, model = list(past_obs = 1), distr = "poisson")
summary(polio_pois)
## 
## Call:
## tsglm(ts = polio$polio, model = list(past_obs = 1), distr = "poisson")
## 
## Coefficients:
##              Estimate  Std.Error  CI(lower)  CI(upper)
## (Intercept)     0.861     0.0993      0.667       1.06
## beta_1          0.360     0.0665      0.230       0.49
## Standard errors and confidence intervals (level =  95 %) obtained
## by normal approximation.
## 
## Link function: identity 
## Distribution family: poisson 
## Number of coefficients: 2 
## Log-likelihood: -280.4975 
## AIC: 564.995 
## BIC: 571.2429 
## QIC: 564.9943
polio_nbin <- tsglm(polio$polio, model = list(past_obs = 1), distr = "nbinom")
summary(polio_nbin)
## 
## Call:
## tsglm(ts = polio$polio, model = list(past_obs = 1), distr = "nbinom")
## 
## Coefficients:
##              Estimate  Std.Error  CI(lower)  CI(upper)
## (Intercept)     0.861      0.127      0.612      1.111
## beta_1          0.360      0.103      0.157      0.563
## sigmasq         0.533         NA         NA         NA
## Standard errors and confidence intervals (level =  95 %) obtained
## by normal approximation.
## 
## Link function: identity 
## Distribution family: nbinom (with overdispersion coefficient 'sigmasq') 
## Number of coefficients: 3 
## Log-likelihood: -258.1501 
## AIC: 522.3002 
## BIC: 531.6721 
## QIC: 569.0743
par(mfrow=c(2,2))
# Marginal calibration reports (1/T) sum_{t=1}^T P(Y_t < y|F_{t-1}) - (1/T) sum_{t=1}^T 1_{y_t < y} 
# Liboschik et al: "If the predictions from a model are appropriate the marginal distribution of the predictions resembles the marginal distribution of the observations"

acf(residuals(polio_pois), main = "ACF of response residuals")
marcal(polio_pois, ylim = c(-0.13, 0.13), main = "Marginal calibration")
lines(marcal(polio_nbin,plot=FALSE),lty="dashed")

# PIT stands for Probability Integral Transform. If the model is adequate, the PIT should be close to uniform
legend("bottomright",legend = c("Pois","NegBin"), lwd=1,lty=c("solid","dashed"))
pit(polio_pois, ylim = c(0, 1.5), main = "PIT Poisson")
pit(polio_nbin, ylim = c(0, 1.5), main = "PIT Negative Binomial")

# These are mean scores, namely, mean "differences" for the predictive distribution and the observations 
rbind(Poisson = scoring(polio_pois), NegBin = scoring(polio_nbin))
##         logarithmic  quadratic  spherical  rankprob   dawseb    normsq
## Poisson    1.669628 -0.2509310 -0.4942208 0.8352967 2.027905 1.8222479
## NegBin     1.536608 -0.2682583 -0.5152435 0.8012837 1.714268 0.9880942
##          sqerror
## Poisson 3.179298
## NegBin  3.179298
polio_pois_pred <- predict(polio_pois, n.ahead = 10, level = 0.9, global = TRUE)
polio_pois_pred$pred
##  [1] 3.020990 1.948746 1.562809 1.423898 1.373899 1.355903 1.349425
##  [8] 1.347094 1.346255 1.345953
polio_pois_pred$interval
##       lower upper
##  [1,]     0     8
##  [2,]     0     6
##  [3,]     0     6
##  [4,]     0     6
##  [5,]     0     5
##  [6,]     0     6
##  [7,]     0     6
##  [8,]     0     6
##  [9,]     0     5
## [10,]     0     6

The fitted model is: \(Y_t|{\cal F}_{t-1} \sim \mbox{Pois}(\lambda_t)\) and \[ \lambda_t = 0.861 + 0.36 Y_{t-1}. \]

Example 2: The number of cases of campylobacter infections in the north of the province Quebec (Canada) in four week intervals from January 1990 to the end of October 2000.

interventions <- interv_covariate(n = length(campy), tau = c(84, 100), delta = c(1, 0))
campyfit_pois <- tsglm(campy, model = list(past_obs = 1, past_mean = 13), xreg = interventions, distr = "poisson")
campyfit_nbin <- tsglm(campy, model = list(past_obs = 1, past_mean = 13), xreg = interventions, distr = "nbinom")

par(mfrow=c(2,2))
acf(residuals(campyfit_pois), main = "ACF of response residuals")
marcal(campyfit_pois, ylim = c(-0.03, 0.03), main = "Marginal calibration")
lines(marcal(campyfit_nbin,plot=FALSE),lty="dashed")
legend("bottomright",legend = c("Pois","NegBin"), lwd=1,lty=c("solid","dashed"))
pit(campyfit_pois, ylim = c(0, 1.5), main = "PIT Poisson")
pit(campyfit_nbin, ylim = c(0, 1.5), main = "PIT Negative Binomial")

rbind(Poisson = scoring(campyfit_pois), NegBin = scoring(campyfit_nbin))
##         logarithmic   quadratic  spherical rankprob   dawseb    normsq
## Poisson    2.750006 -0.07669318 -0.2751017 2.200186 3.662209 1.3080625
## NegBin     2.722028 -0.07799704 -0.2765909 2.185105 3.605935 0.9642855
##          sqerror
## Poisson 16.51282
## NegBin  16.51282
summary(campyfit_nbin)
## 
## Call:
## tsglm(ts = campy, model = list(past_obs = 1, past_mean = 13), 
##     xreg = interventions, distr = "nbinom")
## 
## Coefficients:
##              Estimate  Std.Error  CI(lower)  CI(upper)
## (Intercept)    3.3184     0.7851     1.7797      4.857
## beta_1         0.3690     0.0696     0.2326      0.505
## alpha_13       0.2198     0.0942     0.0352      0.404
## interv_1       3.0810     0.8560     1.4032      4.759
## interv_2      41.9541    12.0914    18.2554     65.653
## sigmasq        0.0297         NA         NA         NA
## Standard errors and confidence intervals (level =  95 %) obtained
## by normal approximation.
## 
## Link function: identity 
## Distribution family: nbinom (with overdispersion coefficient 'sigmasq') 
## Number of coefficients: 6 
## Log-likelihood: -381.0839 
## AIC: 774.1678 
## BIC: 791.8176 
## QIC: 787.6442

The fitted model is: \(Y_t|{\cal F}_{t-1} \sim \mbox{NegBin}(\lambda_t,33.61)\) and \[ \lambda_t = 3.32 + 0.37 Y_{t-1} + 0.22 \lambda_{t-13} + 3.08 \mathbf{1}_{\{t\geq 84\}} + 41.95 \mathbf{1}_{\{t= 100\}} \]

Example 3: The monthly totals of car drivers in Great Britain killed or seriously injured Jan 1969 to Dec 1984. Compulsory wearing of seat belts was introduced on 31 Jan 1983.

timeseries <- Seatbelts[, "VanKilled"]
regressors <- cbind(PetrolPrice = Seatbelts[, c("PetrolPrice")],linearTrend = seq(along = timeseries)/12)
timeseries_until1981 <- window(timeseries, end = 1981 + 11/12)
regressors_until1981 <- window(regressors, end = 1981 + 11/12)
seatbeltsfit <- tsglm(timeseries_until1981,model = list(past_obs = c(1, 12)), link = "log", distr = "poisson", xreg = regressors_until1981)
#summary(seatbeltsfit, B = 500)
summary(seatbeltsfit)
## 
## Call:
## tsglm(ts = timeseries_until1981, model = list(past_obs = c(1, 
##     12)), xreg = regressors_until1981, link = "log", distr = "poisson")
## 
## Coefficients:
##              Estimate  Std.Error  CI(lower)  CI(upper)
## (Intercept)    1.9037    0.36911     1.1803     2.6272
## beta_1         0.0856    0.08095    -0.0730     0.2443
## beta_12        0.1509    0.08499    -0.0156     0.3175
## PetrolPrice    0.0826    2.30408    -4.4333     4.5985
## linearTrend   -0.0291    0.00793    -0.0446    -0.0135
## Standard errors and confidence intervals (level =  95 %) obtained
## by normal approximation.
## 
## Link function: log 
## Distribution family: poisson 
## Number of coefficients: 5 
## Log-likelihood: -396.3849 
## AIC: 802.7697 
## BIC: 818.019 
## QIC: 802.7697
timeseries_1982 <- window(timeseries, start = 1982, end = 1982 + 11/12)
regressors_1982 <- window(regressors, start = 1982, end = 1982 + 11/12)
predict(seatbeltsfit, n.ahead = 12, level = 0.9, global = TRUE, B = 2000, newxreg = regressors_1982)$pred
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1982 7.707988 7.454226 7.568846 7.409922 7.210433 6.985772 7.145880
##           Aug      Sep      Oct      Nov      Dec
## 1982 7.826338 7.493486 7.816908 8.022388 7.451928
seatbeltsfit_alldata <- tsglm(timeseries, link = "log",model = list(past_obs = c(1, 12)), xreg = regressors, distr = "poisson")
interv_test(seatbeltsfit_alldata, tau = 170, delta = 1, est_interv = TRUE)
## 
##  Score test on intervention(s) of given type at given time
## 
## Chisq-Statistic: 1.152953 on 1 degree(s) of freedom, p-value: 0.2829319
## 
## Fitted model with the specified intervention:
## 
## Call:
## tsglm(ts = fit$ts, model = model_extended, xreg = xreg_extended, 
##     link = fit$link, distr = fit$distr)
## 
## Coefficients:
## (Intercept)       beta_1      beta_12  PetrolPrice  linearTrend  
##     0.19508      0.08819      0.80446      3.17408     -0.04788  
##    interv_1  
##     0.24570

Some final notes

Special cases of the introduced count models can be viewed and treated as Hidden Markov Models (to be discussed next class).

Other types of count time series models have also been considered, for example, based on binomial thinning, aggregation of suitable binomial random variables, …

Homework

Analyze a real count time series data using the methods discussed.

References

Liboschik, T., K. Fokianos, and R. Fried. 2016. “Tscount: An R Package for Analysis of Count Time Series Following Generalized Linear Models.”