18.3 Fixed Effects Panel Data Model
The two sections on fixed and random effects panel data models use an old data set that is nevertheless standard in panel data texts. The data set is called grunfeld and is part of the package plm. It contains the following variables for 10 companies over the period 1935 to 1954:
- \(inv\): Investment
- \(value\): Value of the firm
- \(capital\): Capital stock
The fixed effects model or Least-Squares Dummy Variable (LSDV) regression model assumes constant slope coefficients but intercepts that vary over \(i\). The regression equation can be written as: \[inv_{it} = \beta_{0i} + \beta_1 \cdot value_{it} + \beta_2 \cdot capital_{it} + \epsilon_{it}\] where \(i\) and \(t\) index the firms and the time periods, respectively.
With four firms, the model can equivalently be written with three firm dummies, the omitted firm serving as the baseline: \[inv_{it} = \alpha_0+\alpha_1 \cdot D_{1i} + \alpha_2 \cdot D_{2i} +\alpha_3 \cdot D_{3i} + \beta_1 \cdot value_{it} + \beta_2 \cdot capital_{it} + \epsilon_{it}\]
The general form with individual-specific effects is \[y_{it} = \alpha_i + \beta \cdot x_{it} + \epsilon_{it}\] where \(\alpha_i\) can be fixed or random.
The companies of interest for this chapter are GM (firm 1), U.S. Steel (firm 2), GE (firm 3), and Westinghouse (firm 8).
In a first step, a pooled model is estimated, i.e., all cross-sectional and time series observations are combined into a single data set with one common intercept: \[inv_{it} = \beta_0 + \beta_1 \cdot value_{it} + \beta_2 \cdot capital_{it} + \epsilon_{it}\] The general formulation of the pooled model is \[y_{it}=\beta_0+\beta_1 \cdot x_{it} + \epsilon_{it}\] There are multiple issues associated with a pooled OLS model:
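- It ignores heterogeneity across firms and over time.
- If the unobserved heterogeneity is correlated with the independent variables, the regressors are correlated with the error term, which leads to biased and inconsistent coefficient estimates. Solution: the fixed effects model takes the heterogeneity into account.
- If the unobserved heterogeneity is uncorrelated with the independent variables, it still remains in the error term and causes autocorrelation between the error terms of the same firm. Fix: random effects model.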
To use the functions from plm, define the data as a panel data set:
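A minimal sketch of this step, assuming the `grunfeld` object used in this section is the `Grunfeld` data frame shipped with plm (note the capital G), restricted to the four firms named above. The subsetting step and the lowercase object name are assumptions chosen to match the calls printed in the output.

```r
library(plm)

# plm ships the data as `Grunfeld`; the lowercase copy below is an
# assumption made to match the object name in this section's output.
data("Grunfeld", package = "plm")

# Keep GM (1), U.S. Steel (2), GE (3), and Westinghouse (8)
grunfeld <- subset(Grunfeld, firm %in% c(1, 2, 3, 8))

# Declare the panel structure: firm is the individual index, year the time index
grunfeld <- pdata.frame(grunfeld, index = c("firm", "year"))

# Report the panel dimensions (number of firms, periods, observations)
pdim(grunfeld)
```

Declaring the index once in `pdata.frame()` means later `plm()` calls do not need an `index` argument.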
There are two ways to estimate a pooled OLS model: use the regular lm() function, or use plm() and specify the model as "pooling". The outputs will be named grunfeld.ols and grunfeld.pooling.
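The two calls might look as follows. This is a sketch that assumes `grunfeld` is the four-firm subset of plm's `Grunfeld` data declared as a panel; the object names follow the `grunfeld.fixed` spelling used with `fixef()` later in this section.

```r
library(plm)

# Assumed setup: four-firm subset of plm's Grunfeld data as a panel data frame
data("Grunfeld", package = "plm")
grunfeld <- pdata.frame(subset(Grunfeld, firm %in% c(1, 2, 3, 8)),
                        index = c("firm", "year"))

# Option 1: ordinary least squares, ignoring the panel structure
grunfeld.ols <- lm(inv ~ value + capital, data = grunfeld)

# Option 2: the same regression estimated through plm()
grunfeld.pooling <- plm(inv ~ value + capital, data = grunfeld,
                        model = "pooling")

summary(grunfeld.pooling)
```

Both options give the same coefficient estimates. The `plm()` version is convenient because the fitted object can be passed directly to panel-specific tests.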
## Pooling Model
##
## Call:
## plm(formula = inv ~ value + capital, data = grunfeld, model = "pooling")
##
## Balanced Panel: n = 4, T = 20, N = 80
##
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max.
## -319.6766  -99.9523    1.9647   65.9905  336.2072
##
## Coefficients:
##               Estimate Std. Error t-value  Pr(>|t|)
## (Intercept) -62.831841  29.725385 -2.1137   0.03778 *
## value         0.110521   0.013776  8.0230 9.186e-12 ***
## capital       0.300463   0.049399  6.0823 4.273e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 6410400
## Residual Sum of Squares: 1572700
## R-Squared: 0.75466
## Adj. R-Squared: 0.74829
## F-statistic: 118.424 on 2 and 77 DF, p-value: < 2.22e-16
Fixed effects model
- Intercept \(\beta_{0i}\) is firm specific.
- In a panel of individuals rather than firms, the intercept could capture education and/or ability, which are possibly correlated with the independent variables.
- Intercept is time-invariant.
- Slope coefficients do not vary across individuals (firms) or time
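Because the firm-specific intercept is time-invariant, it can be removed by subtracting each firm's time average from every variable; OLS on the demeaned data then yields the common slopes. The following by-hand sketch illustrates this, assuming the four-firm subset of plm's `Grunfeld` data. The helper `demean()` and the `_w` column names are hypothetical and only serve the illustration.

```r
library(plm)  # used here only for the Grunfeld data

data("Grunfeld", package = "plm")
g <- subset(Grunfeld, firm %in% c(1, 2, 3, 8))

# Hypothetical helper: subtract the group (firm) mean from each observation
demean <- function(x, group) x - ave(x, group)

g$inv_w     <- demean(g$inv,     g$firm)
g$value_w   <- demean(g$value,   g$firm)
g$capital_w <- demean(g$capital, g$firm)

# No intercept: the firm-specific intercepts were removed by demeaning
within.byhand <- lm(inv_w ~ value_w + capital_w - 1, data = g)
coef(within.byhand)
```

The slope estimates coincide with those of a fixed effects fit. The standard errors reported by this `lm()` call do not, because `lm()` does not account for the degrees of freedom used up by the firm means.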
To implement the model in R, use the function plm() and specify the model as "within". The output will be named grunfeld.fixed.
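A sketch of the call, assuming `grunfeld` is the four-firm subset of plm's `Grunfeld` data declared as a panel with `pdata.frame()`:

```r
library(plm)

# Assumed setup: four-firm subset of plm's Grunfeld data as a panel data frame
data("Grunfeld", package = "plm")
grunfeld <- pdata.frame(subset(Grunfeld, firm %in% c(1, 2, 3, 8)),
                        index = c("firm", "year"))

# model = "within" requests the fixed effects (within) estimator
grunfeld.fixed <- plm(inv ~ value + capital, data = grunfeld,
                      model = "within")

summary(grunfeld.fixed)
```

The summary reports no intercept because the firm-specific intercepts are swept out by the within estimator.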
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = inv ~ value + capital, data = grunfeld, model = "within")
##
## Balanced Panel: n = 4, T = 20, N = 80
##
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max.
## -184.6581  -48.2612    9.3252   40.5471  197.6681
##
## Coefficients:
##         Estimate Std. Error t-value Pr(>|t|)
## value   0.108400   0.017566  6.1711  3.3e-08 ***
## capital 0.345058   0.026708 12.9195  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 2171500
## Residual Sum of Squares: 422220
## R-Squared: 0.80556
## Adj. R-Squared: 0.79242
## F-statistic: 153.291 on 2 and 74 DF, p-value: < 2.22e-16
Use fixef(grunfeld.fixed) to get the firm-specific intercepts. The function pFtest() can be used to test whether the fixed effects model or pooled OLS is appropriate (H\(_0\): there are no firm-specific effects, so pooled OLS is adequate). If the significance level is set to \(\alpha=0.05\), the fixed effects model is the better choice if the p-value is below 0.05:
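Both calls can be sketched as follows, assuming the pooled and within models are fitted on the four-firm subset of plm's `Grunfeld` data:

```r
library(plm)

# Assumed setup: four-firm subset of plm's Grunfeld data as a panel data frame
data("Grunfeld", package = "plm")
grunfeld <- pdata.frame(subset(Grunfeld, firm %in% c(1, 2, 3, 8)),
                        index = c("firm", "year"))

grunfeld.pooling <- plm(inv ~ value + capital, data = grunfeld,
                        model = "pooling")
grunfeld.fixed   <- plm(inv ~ value + capital, data = grunfeld,
                        model = "within")

# One estimated intercept per firm
fixef(grunfeld.fixed)

# F test of the within model against the pooled model
pFtest(grunfeld.fixed, grunfeld.pooling)
```

The test compares the residual sums of squares of the two fits; the numerator degrees of freedom equal the number of firms minus one.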
##
## F test for individual effects
##
## data: inv ~ value + capital
## F = 67.215, df1 = 3, df2 = 74, p-value < 2.2e-16
## alternative hypothesis: significant effects
The fixed effects model can also be estimated with the function lm() by adding the firm identifier as a factor:
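A sketch of the dummy variable version, again assuming the four-firm subset of plm's `Grunfeld` data. The object name `grunfeld.lsdv` is hypothetical.

```r
library(plm)

# Assumed setup: four-firm subset of plm's Grunfeld data as a panel data frame
data("Grunfeld", package = "plm")
grunfeld <- pdata.frame(subset(Grunfeld, firm %in% c(1, 2, 3, 8)),
                        index = c("firm", "year"))

# factor(firm) adds one dummy per firm except the baseline (firm 1, GM)
grunfeld.lsdv <- lm(inv ~ value + capital + factor(firm), data = grunfeld)

grunfeld.lsdv
```

In this parameterization the intercept belongs to the baseline firm, and each dummy coefficient is the difference between that firm's intercept and the baseline. The slope coefficients on value and capital are the same as in the within model.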
##
## Call:
## lm(formula = inv ~ value + capital + factor(firm), data = grunfeld)
##
## Coefficients:
##     (Intercept)          value        capital  factor(firm)2  factor(firm)3  factor(firm)8
##        -85.5153         0.1084         0.3451       180.5029      -160.7122        26.1296