8.2 Confidence Interval for the Mean

To construct a confidence interval for a population mean, the population standard deviation \(\sigma\) must be estimated from the sample. The sample variance is denoted by \(s^2\), and the standard error (S.E.) of the sample mean is \[S.E. = \frac{s}{\sqrt{n}}\] In most cases, \(\sigma^2\) is unknown. If \(\sigma^2\) were known, the normal distribution could be used. Because we rely on an estimate of the variance, we must account for the additional uncertainty this estimation introduces, so we use the t-distribution with \(n-1\) degrees of freedom. The t-score is similar to the z-score in that it comes from a bell-shaped curve, but the tails of the t-distribution are thicker, so its critical values are larger than the corresponding normal ones.
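The effect of the thicker tails can be seen by comparing critical values. The following is a minimal sketch that puts the two-sided 95% critical value of the standard normal next to that of the t-distribution for a few degrees of freedom. The particular degrees of freedom are chosen only for illustration.

```r
# Two-sided 95% critical values: standard normal versus t-distribution
z_crit <- qnorm(0.975)            # normal critical value (about 1.96)
df     <- c(5, 17, 60, 1000)      # illustrative degrees of freedom
t_crit <- qt(0.975, df)           # t critical value for each df

# The t critical value exceeds the normal one and approaches it as df grows
data.frame(df = df, t_crit = round(t_crit, 4), z_crit = round(z_crit, 4))
```

A larger critical value gives a wider interval, which is how the t-distribution accounts for estimating the variance from a small sample.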

Consider the data in eggweights. To construct a 95% confidence interval, the sample mean and sample standard deviation are required. Calculating those values gives \(\bar{x}=61.05\) and \(s=4.46\), so the standard error is \[S.E. = \frac{4.46}{\sqrt{61}}=0.57\] For \(n=61\), the degrees of freedom are \(df=n-1=60\) and the critical value is \(t_{0.975,60}=2.0003\), which gives the confidence interval \(61.05 \pm 2.0003 \cdot 0.57\), or approximately \((59.91, 62.19)\).
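The arithmetic of this example can be reproduced from the summary statistics alone. The sketch below assumes only the rounded values reported in the text (\(\bar{x}=61.05\), \(s=4.46\), \(n=61\)) and does not use the raw eggweights data, so the last digits may differ slightly from a calculation on the full data set.

```r
# Summary statistics as reported in the text (rounded values)
xbar <- 61.05   # sample mean
s    <- 4.46    # sample standard deviation
n    <- 61      # sample size

se     <- s / sqrt(n)                     # standard error of the mean
t_crit <- qt(0.975, df = n - 1)           # critical value for a 95% interval
ci     <- xbar + c(-1, 1) * t_crit * se   # lower and upper limits

round(c(se = se, t_crit = t_crit, lower = ci[1], upper = ci[2]), 4)
```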

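# The same steps in R, here applied to the price column of the mh2 data
nobs           = nrow(mh2)                  # sample size
meandata       = mean(mh2$price)            # sample mean
stdev          = sd(mh2$price)              # sample standard deviation
t_alpha_df     = qt(0.975,nobs-1)           # critical value for a 95% interval, df = n - 1
CI_lower       = meandata-t_alpha_df*stdev/sqrt(nobs)
CI_upper       = meandata+t_alpha_df*stdev/sqrt(nobs)
# t.test() reports the same 95% confidence interval
t.test(mh2$price)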
## 
##  One Sample t-test
## 
## data:  mh2$price
## t = 6.6926, df = 17, p-value = 3.783e-06
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   564678.8 1084609.0
## sample estimates:
## mean of x 
##  824643.9
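The interval can also be extracted from the result of t.test() directly, and the confidence level can be changed with the conf.level argument. The sketch below uses simulated data as a stand-in, since it is meant to run without the mh2 data. The vector x, its size, and its distribution are assumptions made purely for illustration.

```r
# Hypothetical data: 18 simulated observations (not the mh2 prices)
set.seed(1)
x <- rnorm(18, mean = 100, sd = 15)
n <- length(x)

# Manual 95% interval, following the same steps as above
manual <- mean(x) + c(-1, 1) * qt(0.975, n - 1) * sd(x) / sqrt(n)

# Interval taken from the t.test() result (conf.level defaults to 0.95)
auto <- as.numeric(t.test(x)$conf.int)
all.equal(manual, auto)

# A 90% interval is narrower than the 95% interval
ci90 <- as.numeric(t.test(x, conf.level = 0.90)$conf.int)
ci90
```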