14.4 Autocorrelation

The correlation of error terms is called autocorrelation. The issue usually arises if there is a time component in the data. Recall the main types of data available for research:

  • Cross-sectional data (multiple observations at same time point)
  • Time series data (one variable observed over time)
  • Pooled data (multiple observations at different time points)
  • Panel data (same observations at different time points)

There is a distinction between serial correlation and autocorrelation:

  • Serial correlation: Correlation between two series
  • Autocorrelation: Correlation with lagged variables

The OLS estimator is still unbiased, but it no longer has minimum variance because \(E(\epsilon_i \epsilon_j) \neq 0\) for some \(i \neq j\). Autocorrelation is unlikely in cross-sectional data except in the case of spatial autocorrelation. One cause of autocorrelation is inertia in economic variables. For example, variables such as income, production, or employment tend to increase over several consecutive periods after a recession. There are a number of other causes as well.

Autocorrelation could be caused by specification bias due to excluded variables or incorrect functional forms. For example, assume that the correct equation is

\[q_{beef}=\beta_0+\beta_1 \cdot p_{beef}+\beta_2 \cdot p_{income} + \beta_3 \cdot p_{pork}+\epsilon_t\]

The estimated equation is:

\[q_{beef}=\beta_0+\beta_1 \cdot p_{beef}+\beta_2 \cdot p_{income}+\upsilon_t\]

The error terms of the two equations are denoted \(\epsilon_t\) and \(\upsilon_t\), respectively. Omitting the price of pork results in a systematic pattern in \(\upsilon_t\):

\[\upsilon_t= \beta_3 \cdot p_{pork}+\epsilon_t\]

Correlation between the error terms can also be caused by specifying an incorrect functional form. Assume that the correct equation is written as follows:

\[y_i = \beta_0 +\beta_1 \cdot x_i +\beta_2 \cdot x_i^2 +\epsilon_i\]

But the estimated equation is

\[y_i = \beta_0 +\beta_1 \cdot x_i +\epsilon_i\]

Serial correlation can be caused by lagged terms in the regression equation:

\[consumption_t = \beta_0 + \beta_1 \cdot income_t+\beta_2 \cdot consumption_{t-1}+\epsilon_t\]

The issues of lagged terms will be covered in the part on dynamic regression and time series; this section serves only as an introduction to first-order autoregressive schemes. Consider the model:

\[y_t=\beta_0+\beta_1 \cdot x_t + \upsilon_t\]

Assume the following form of \(\upsilon_t\):

\[\upsilon_t = \rho \cdot \upsilon_{t-1} + \epsilon_t\]

This is called a first-order autoregressive, or AR(1), scheme. An AR(2) scheme would be written as

\[\upsilon_t = \rho_1 \cdot \upsilon_{t-1} + \rho_2 \cdot \upsilon_{t-2} + \epsilon_t\]

This can be illustrated with simulated data. Consider the following model:

\[y_t=1+0.8 \cdot x_t + \upsilon_t\]

and assume the following form of \(\upsilon_t\):

\[\upsilon_t = 0.7 \cdot \upsilon_{t-1} + \epsilon_t\]

  1. Simulate the above model 100 times
  2. Compare the variance of the coefficients under two different methods: (1) OLS and (2) Cochrane-Orcutt
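The two steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: the sample size of 100 observations per replication, the standard normal draws for \(x_t\) and \(\epsilon_t\), the fixed seed, and the fixed number of Cochrane-Orcutt iterations are all assumptions made here, and the helper names (`ols`, `cochrane_orcutt`, `simulate_ar1_sample`, `run_simulation`) are hypothetical.

```python
import math
import random
import statistics


def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cochrane_orcutt(x, y, iterations=10):
    """Iterated Cochrane-Orcutt for one regressor; returns (intercept, slope, rho)."""
    b0, b1 = ols(x, y)
    rho = 0.0
    for _ in range(iterations):  # assumption: fixed iteration count, no convergence check
        e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        # Estimate rho by regressing e_t on e_{t-1} without an intercept.
        rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e[:-1])
        # Quasi-difference the data and re-estimate by OLS.
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a0, b1 = ols(xs, ys)
        b0 = a0 / (1.0 - rho)  # recover the intercept of the original model
    return b0, b1, rho


def simulate_ar1_sample(n, rng, b0=1.0, b1=0.8, rho=0.7):
    """Draw one sample from y_t = b0 + b1 * x_t + u_t with u_t = rho * u_{t-1} + eps_t."""
    # Assumption: x_t and eps_t are standard normal; u starts in its stationary distribution.
    u = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho ** 2))
    x, y = [], []
    for _ in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)
        xt = rng.gauss(0.0, 1.0)
        x.append(xt)
        y.append(b0 + b1 * xt + u)
    return x, y


def run_simulation(replications=100, n=100, seed=42):
    """Return the slope estimates from OLS and Cochrane-Orcutt for each replication."""
    rng = random.Random(seed)
    ols_slopes, co_slopes = [], []
    for _ in range(replications):
        x, y = simulate_ar1_sample(n, rng)
        ols_slopes.append(ols(x, y)[1])
        co_slopes.append(cochrane_orcutt(x, y)[1])
    return ols_slopes, co_slopes


ols_slopes, co_slopes = run_simulation()
for name, slopes in (("OLS", ols_slopes), ("Cochrane-Orcutt", co_slopes)):
    print(f"{name}: mean slope = {statistics.fmean(slopes):.3f}, "
          f"variance = {statistics.variance(slopes):.5f}")
```

Under these assumptions, both estimators should center on the true slope of 0.8, while the Cochrane-Orcutt slopes should vary less across replications than the OLS slopes.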

14.4.1 Durbin-Watson d-Test

The test statistic of the Durbin-Watson test is written as:

\[d=\frac{\sum_{t=2}^N (e_t-e_{t-1})^2}{\sum_{t=1}^N e_t^2}\]

where \(e_t\) denotes the OLS residual in period \(t\).

The assumptions underlying the test are:

  • The regression includes an intercept
  • The errors follow an AR(1) process, i.e., \(\upsilon_t = \rho \upsilon_{t-1} + \epsilon_t\)
  • No lagged dependent variable among the regressors

The original papers derive lower (\(d_L\)) and upper (\(d_U\)) bounds for the critical values that depend only on the number of observations \(N\) and the number of regressors \(k\).

  • \(d \approx 2 \cdot (1-\rho)\) and since \(-1 \leq \rho \leq 1\), we have \(0 \leq d \leq 4\).
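The approximation can be motivated by expanding the numerator of \(d\). The following is a short sketch in which \(\hat{\rho}\), notation introduced here, denotes the sample first-order autocorrelation of the residuals:

\[\sum_{t=2}^N (e_t-e_{t-1})^2 = \sum_{t=2}^N e_t^2 + \sum_{t=2}^N e_{t-1}^2 - 2 \sum_{t=2}^N e_t e_{t-1}\]

For large \(N\), each of the first two sums on the right-hand side is approximately equal to \(\sum_{t=1}^N e_t^2\), so that

\[d \approx 2 \cdot \left(1 - \frac{\sum_{t=2}^N e_t e_{t-1}}{\sum_{t=1}^N e_t^2}\right) = 2 \cdot (1-\hat{\rho})\]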

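As a rule of thumb, a value of \(d\) close to 2 signals no first-order autocorrelation; values near 0 point to positive autocorrelation and values near 4 to negative autocorrelation.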
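The statistic is straightforward to compute from a list of residuals. The following is a minimal sketch in plain Python; the function name `durbin_watson` and the two small residual vectors are made up here purely for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared first differences over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Illustrative residuals that change slowly (positive autocorrelation): d < 2.
print(durbin_watson([2.0, 1.0, -1.0, -2.0]))  # 0.6

# Illustrative residuals that alternate in sign (negative autocorrelation): d > 2.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 3.0
```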

14.4.2 Breusch-Godfrey Test

Consider the following model \(y_t = \beta_0 + \beta_1 x_t + \upsilon_t\) with the following error term structure:

\[\upsilon_t = \rho_1 \upsilon_{t-1} + \rho_2 \upsilon_{t-2} + \dots + \rho_p \upsilon_{t-p} + \epsilon_t\]

The null hypothesis for the test is expressed as follows:

  • \(H_0\): \(\rho_1 = \rho_2 = \dots = \rho_p =0\)

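The test is based on the following auxiliary regression of the OLS residuals \(\hat{\upsilon}_t\) on the original regressor and the lagged residuals:

\[\hat{\upsilon}_t = \alpha_0 + \alpha_1 \cdot x_t + \hat{\rho}_1 \cdot \hat{\upsilon}_{t-1} + \hat{\rho}_2 \cdot \hat{\upsilon}_{t-2} + \dots + \hat{\rho}_p \cdot \hat{\upsilon}_{t-p} + \epsilon_t\]

Under the null hypothesis, the test statistic is asymptotically distributed as

\[(n-p) \cdot R^2 \sim \chi^2_p\]

where \(R^2\) is the coefficient of determination of the auxiliary regression.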
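The procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the simulated data (200 observations, standard normal \(x_t\) and \(\epsilon_t\), errors starting at zero), the seed, and the helper names (`solve`, `ols_fit`, `breusch_godfrey`, `simulate_regression`) are all hypothetical. The example uses \(p = 2\), for which the \(\chi^2_2\) upper-tail probability has the closed form \(e^{-x/2}\).

```python
import math
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def ols_fit(X, y):
    """OLS via the normal equations; returns (coefficients, residuals, R^2)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    ybar = sum(y) / len(y)
    tss = sum((yi - ybar) ** 2 for yi in y)
    rss = sum(e * e for e in resid)
    return beta, resid, 1.0 - rss / tss


def breusch_godfrey(x, y, p):
    """Return the (n - p) * R^2 statistic from the auxiliary regression."""
    n = len(y)
    _, e, _ = ols_fit([[1.0, xi] for xi in x], y)
    # Use observations t = p, ..., n-1 so that all p lagged residuals exist.
    Z = [[1.0, x[t]] + [e[t - j] for j in range(1, p + 1)] for t in range(p, n)]
    _, _, r2 = ols_fit(Z, e[p:])
    return (n - p) * r2


def simulate_regression(n, rho, rng):
    """Simulate y_t = 1 + 0.8 * x_t + u_t with u_t = rho * u_{t-1} + eps_t (u starts at 0)."""
    x, y, u = [], [], 0.0
    for _ in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)
        xt = rng.gauss(0.0, 1.0)
        x.append(xt)
        y.append(1.0 + 0.8 * xt + u)
    return x, y


rng = random.Random(1)
stat_ar = breusch_godfrey(*simulate_regression(200, 0.7, rng), p=2)
stat_iid = breusch_godfrey(*simulate_regression(200, 0.0, rng), p=2)
pval_ar = math.exp(-stat_ar / 2.0)    # chi-square(2) upper-tail probability
pval_iid = math.exp(-stat_iid / 2.0)
print(f"AR(1) errors: statistic = {stat_ar:.2f}, p-value = {pval_ar:.4g}")
print(f"iid errors:   statistic = {stat_iid:.2f}, p-value = {pval_iid:.4g}")
```

With autocorrelated errors the statistic should far exceed 5.99, the 5% critical value of \(\chi^2_2\), so the null hypothesis of no autocorrelation is rejected.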