14.4 Autocorrelation
Correlation among the error terms is called autocorrelation. The issue usually arises when the data contain a time component. Recall the main types of data available for research:
- Cross-sectional data (multiple observations at same time point)
- Time series data (one variable observed over time)
- Pooled data (multiple observations at different time points)
- Panel data (same observations at different time points)
There is a distinction between serial correlation and autocorrelation:
- Serial correlation: Lag correlation between two different series
- Autocorrelation: Lag correlation of a series with its own past values
Under autocorrelation, the OLS estimator is still unbiased but no longer has minimum variance, since \(E(\epsilon_i \epsilon_j) \neq 0\) for \(i \neq j\). Autocorrelation is unlikely for cross-sectional data except in the case of spatial autocorrelation. One cause of autocorrelation is inertia in economic variables. For example, variables such as income, production, or employment increase only gradually after a recession, so successive observations are correlated. But there are a number of other reasons for autocorrelation.
Autocorrelation could be caused by specification bias due to excluded variables or incorrect functional forms. For example, assume that the correct equation is
\[q_{beef}=\beta_0+\beta_1 \cdot p_{beef}+\beta_2 \cdot p_{income} + \beta_3 \cdot p_{pork}+\epsilon_t\] The estimated equation is: \[q_{beef}=\beta_0+\beta_1 \cdot p_{beef}+\beta_2 \cdot p_{income}+\upsilon_t\]
The error terms in the two equations are denoted \(\epsilon_t\) and \(\upsilon_t\), respectively. Omitting the pork price results in a systematic pattern of \(\upsilon_t\): \[\upsilon_t= \beta_3 \cdot p_{pork}+\epsilon_t\] Correlation between the error terms can also be caused by specifying an incorrect functional form. Assume that the correct equation is written as follows: \[y_i = \beta_0 +\beta_1 \cdot x_i +\beta_2 \cdot x_i^2 +\epsilon_i\] But the estimated equation is \[y_i = \beta_0 +\beta_1 \cdot x_i +\epsilon_i\] Serial correlation is also caused by lagged terms in the regression equation: \[consumption_t = \beta_0 + \beta_1 \cdot income_t+\beta_2 \cdot consumption_{t-1}+\epsilon_t\] The issue of lagged terms will be covered in the part on dynamic regression and time series; this section serves only as an introduction to first-order autoregressive schemes. Consider the model: \[y_t=\beta_0+\beta_1 \cdot x_t + \upsilon_t\] Assume the following form of \(\upsilon_t\): \[\upsilon_t = \rho \cdot \upsilon_{t-1} + \epsilon_t\] This is called a first-order autoregressive, or AR(1), scheme. An AR(2) scheme would be written as \[\upsilon_t = \rho_1 \cdot \upsilon_{t-1} + \rho_2 \cdot \upsilon_{t-2} + \epsilon_t\] This can be illustrated with simulated data. Consider the following model: \[y_t=1+0.8 \cdot x_t + \upsilon_t\] and assume the following form of \(\upsilon_t\): \[\upsilon_t = 0.7 \cdot \upsilon_{t-1} + \epsilon_t\] The exercise is as follows:
1. Simulate the above model 100 times.
2. Compare the variance of the coefficient estimates under two different methods: (1) OLS and (2) Cochrane-Orcutt.
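The simulation exercise above can be sketched in Python. This is a minimal illustration, not part of the original text: the function names are arbitrary, the regressor is assumed to be AR(1) as well (so that the efficiency loss of OLS is visible), and only a single Cochrane-Orcutt iteration is performed rather than iterating to convergence.

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1(n, rho):
    """Generate an AR(1) series z_t = rho * z_{t-1} + e_t with standard normal e_t."""
    z = np.zeros(n)
    e = rng.normal(size=n)
    for t in range(1, n):
        z[t] = rho * z[t - 1] + e[t]
    return z

def ols(x, y):
    """Return OLS coefficients (intercept, slope) for y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cochrane_orcutt(x, y):
    """One Cochrane-Orcutt iteration: estimate rho from OLS residuals,
    quasi-difference the data, and re-run OLS on the transformed model."""
    b = ols(x, y)
    u = y - b[0] - b[1] * x
    rho_hat = np.sum(u[1:] * u[:-1]) / np.sum(u[:-1] ** 2)
    y_star = y[1:] - rho_hat * y[:-1]
    x_star = x[1:] - rho_hat * x[:-1]
    return ols(x_star, y_star)

ols_slopes, co_slopes = [], []
for _ in range(100):                   # simulate the model 100 times
    n = 100
    x = ar1(n, 0.7)                    # autocorrelated regressor (an assumption)
    u = ar1(n, 0.7)                    # AR(1) errors: u_t = 0.7 u_{t-1} + e_t
    y = 1 + 0.8 * x + u                # y_t = 1 + 0.8 x_t + u_t
    ols_slopes.append(ols(x, y)[1])
    co_slopes.append(cochrane_orcutt(x, y)[1])

print("Sampling variance of beta1, OLS:            ", round(np.var(ols_slopes), 4))
print("Sampling variance of beta1, Cochrane-Orcutt:", round(np.var(co_slopes), 4))
```

Both estimators should center on the true slope of 0.8; the spread of the OLS estimates across replications is expected to be larger, reflecting the efficiency loss under autocorrelated errors.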
14.4.1 Durbin Watson d-Test
The test statistic of the Durbin-Watson test is written as: \[d=\frac{\sum_{t=2}^N (e_t-e_{t-1})^2}{\sum_{t=1}^N e_t^2}\]
Assumptions underlying the test are
- The regression includes an intercept term
- AR(1) error process, i.e., \(\upsilon_t = \rho \upsilon_{t-1} + \epsilon_t\)
- No lagged dependent variables among the regressors
The original papers derive lower (\(d_L\)) and upper (\(d_U\)) bounds, i.e., critical values, that depend only on \(N\) and \(k\).
- \(d \approx 2 \cdot (1-\rho)\) and since \(-1 \leq \rho \leq 1\), we have \(0 \leq d \leq 4\).
As a rule of thumb, \(d \approx 2\) signals no autocorrelation; values close to 0 indicate positive autocorrelation and values close to 4 indicate negative autocorrelation.
14.4.2 Breusch-Godfrey Test
Consider the following model \(y_t = \beta_0 + \beta_1 x_t + \upsilon_t\) with the following error term structure: \[\upsilon_t = \rho_1 \upsilon_{t-1} + \rho_2 \upsilon_{t-2} + \dots + \rho_p \upsilon_{t-p} + \epsilon_t\] The null hypothesis for the test is expressed as follows:
- \(H_0\): \(\rho_1 = \rho_2 = \dots = \rho_p =0\)
The test is implemented by regressing the OLS residuals \(\hat{\upsilon}_t\) on the original regressors and \(p\) lagged residuals:
\[\hat{\upsilon}_t = \alpha_0 + \alpha_1 \cdot x_t + \hat{\rho}_1 \cdot \hat{\upsilon}_{t-1} + \hat{\rho}_2 \cdot \hat{\upsilon}_{t-2} + \dots + \hat{\rho}_p \cdot \hat{\upsilon}_{t-p} + \epsilon_t\] Under the null hypothesis, \[(n-p) \cdot R^2 \sim \chi^2_p\] where \(R^2\) is taken from this auxiliary regression.
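The auxiliary-regression procedure can be sketched in Python. This is a hand-rolled illustration under assumed simulated data (sample size, \(\rho = 0.7\), and \(p = 2\) are all choices made for the example); the 5% critical value \(\chi^2_{2, 0.95} \approx 5.99\) is used for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)

def breusch_godfrey(x, y, p=2):
    """Breusch-Godfrey LM statistic: regress the OLS residuals on the
    original regressor and p lagged residuals; under H0, (n-p)*R^2 ~ chi2_p."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ b                                      # OLS residuals
    # Auxiliary regression on observations t = p, ..., n-1:
    # columns are intercept, x_t, u_{t-1}, ..., u_{t-p}
    Z = np.column_stack([np.ones(n - p), x[p:]] +
                        [u[p - j:n - j] for j in range(1, p + 1)])
    g = np.linalg.lstsq(Z, u[p:], rcond=None)[0]
    resid = u[p:] - Z @ g
    tss = np.sum((u[p:] - u[p:].mean()) ** 2)
    r2 = 1 - np.sum(resid ** 2) / tss
    return (n - p) * r2

n = 200
x = rng.normal(size=n)

# Under H0 (white-noise errors): LM should rarely exceed chi2_{2,0.95} = 5.99
lm_h0 = breusch_godfrey(x, 1 + 0.8 * x + rng.normal(size=n))

# Under AR(1) errors with rho = 0.7: LM should be large
u = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + eps[t]
lm_ar = breusch_godfrey(x, 1 + 0.8 * x + u)

print(f"LM under H0:    {lm_h0:.2f}")
print(f"LM under AR(1): {lm_ar:.2f}")
```

Comparing each LM statistic against the \(\chi^2_p\) critical value reproduces the test decision: fail to reject under white-noise errors, reject under the AR(1) alternative.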