17.1 Truncation

Under truncation, part of the data is not observed at all. In the graph below, the true parameters are \(\beta_0=-2\) and \(\beta_1=0.5\), and observations with \(y<0\) are not reported in the data. The green regression line is the "correct" one, whereas the red line is obtained from a regression model that ignores the truncation.
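The setup described above can be sketched in R as follows. This is only an illustration: the text does not show how the data were generated, so the seed, the range of `x` and the error standard deviation are assumptions. The object names (`truncation`, `yreal`, `yobs`, `bhatreal`, `bhattruncated`) follow the output shown in this section.

```r
# Hypothetical data-generating sketch; the seed, the x range and the
# error standard deviation are assumptions, not taken from the text.
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 10)
yreal <- -2 + 0.5 * x + rnorm(n, mean = 0, sd = 1)  # beta0 = -2, beta1 = 0.5

# Truncation from below at 0: values y < 0 are not reported
yobs <- ifelse(yreal < 0, NA, yreal)
truncation <- data.frame(x = x, yreal = yreal, yobs = yobs)

bhatreal      <- lm(yreal ~ x, data = truncation)  # uses all observations
bhattruncated <- lm(yobs ~ x, data = truncation)   # drops rows where yobs is NA
```

Because the draws are random, the estimates from this sketch will not match the numbers printed in this section.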

If all the data were observed, the regression model would give the following results:

summary(bhatreal)
## 
## Call:
## lm(formula = yreal ~ x, data = truncation)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0300 -0.6778 -0.1484  0.7101  2.0034 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.95643    0.27902  -7.012 7.05e-09 ***
## x            0.51658    0.05071  10.188 1.37e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.048 on 48 degrees of freedom
## Multiple R-squared:  0.6838, Adjusted R-squared:  0.6772 
## F-statistic: 103.8 on 1 and 48 DF,  p-value: 1.372e-13

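If the truncation is ignored, the estimates are biased: the intercept is estimated at about -0.84 rather than -2, and the slope at about 0.39 rather than 0.5: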

summary(bhattruncated)
## 
## Call:
## lm(formula = yobs ~ x, data = truncation)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1092 -0.5793 -0.2110  0.5564  1.6747 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.84279    0.62042  -1.358 0.185998    
## x            0.38663    0.08905   4.342 0.000191 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9732 on 26 degrees of freedom
##   (22 observations deleted due to missingness)
## Multiple R-squared:  0.4203, Adjusted R-squared:  0.398 
## F-statistic: 18.85 on 1 and 26 DF,  p-value: 0.0001909
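
One way to see where this bias comes from is the conditional mean of the observed outcomes. As a sketch, assume normally distributed errors with standard deviation \(\sigma\) (an assumption not stated explicitly above), and let \(\phi\) and \(\Phi\) denote the standard normal density and distribution function. For observations truncated from below at zero:

\[
E[y \mid x,\, y>0] = \beta_0 + \beta_1 x + \sigma\,\frac{\phi\big((\beta_0+\beta_1 x)/\sigma\big)}{\Phi\big((\beta_0+\beta_1 x)/\sigma\big)}
\]

The last term is positive and is largest where \(\beta_0+\beta_1 x\) is small. A least-squares fit to the observed data therefore tends to produce an intercept that is too high and a slope that is too flat.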

To correct for the truncation, use the truncreg() function from the truncreg package, which estimates the model by maximum likelihood and reduces the bias of the coefficients:

library(truncreg)
# defaults: truncation from below (direction = "left") at point = 0
bhatcorrect <- truncreg(yobs ~ x, data = truncation)
summary(bhatcorrect)
## 
## Call:
## truncreg(formula = yobs ~ x, data = truncation)
## 
## BFGS maximization method
## 31 iterations, 0h:0m:0s 
## g'(-H)^-1g = 2.7E-12 
##  
## 
## 
## Coefficients :
##             Estimate Std. Error t-value  Pr(>|t|)    
## (Intercept) -3.03940    1.51806 -2.0022  0.045267 *  
## x            0.64446    0.18585  3.4676  0.000525 ***
## sigma        1.13419    0.22654  5.0066 5.541e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -32.986 on 3 Df
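
To illustrate the idea behind a truncated regression, the following sketch maximizes the log-likelihood of a normal regression truncated from below at zero, using optim(). It is not the internal code of truncreg. The simulated data (seed, range of `x`, error standard deviation) and the helper name `negloglik` are assumptions made for this example.

```r
# Sketch: maximum likelihood for a regression truncated from below at 0.
# The simulated data (seed, x range, sd = 1) are assumptions for illustration.
set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- -2 + 0.5 * x + rnorm(n)
keep  <- y >= 0                   # only non-negative outcomes are observed
x_obs <- x[keep]
y_obs <- y[keep]

# Negative log-likelihood; theta = (beta0, beta1, log(sigma))
negloglik <- function(theta, x, y) {
  mu    <- theta[1] + theta[2] * x
  sigma <- exp(theta[3])          # log scale keeps sigma positive
  ll <- dnorm(y, mean = mu, sd = sigma, log = TRUE) -
    pnorm(mu / sigma, log.p = TRUE)   # subtract log P(y > 0 | x)
  -sum(ll)
}

# Start from the (biased) least-squares fit on the observed data
ols   <- lm(y_obs ~ x_obs)
start <- unname(c(coef(ols), log(summary(ols)$sigma)))
fit   <- optim(start, negloglik, x = x_obs, y = y_obs, method = "BFGS")

est <- c(fit$par[1:2], exp(fit$par[3]))
names(est) <- c("(Intercept)", "x", "sigma")
print(est)
```

Each observation's density is divided by the probability of being observed, \(\Phi\big((\beta_0+\beta_1 x)/\sigma\big)\), which is what distinguishes this likelihood from ordinary least squares.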