8.1 Confidence Interval for a Proportion
Recall from the Bernoulli distribution that \(p+q=1\) and \(\sigma=\sqrt{p \cdot q}\). Because \(p\) is unknown, it is replaced by the sample proportion \(\hat{p}\), and the standard error of \(\hat{p}\) is estimated as
\[\sigma_{\hat{p}}=\sqrt{\frac{\hat{p} \cdot (1-\hat{p})}{n}}\]
Thus, the 95% confidence interval is constructed as follows:
\[\hat{p} \pm 1.96 \cdot \sigma_{\hat{p}} \Leftrightarrow \hat{p} \pm 1.96 \cdot \sqrt{\frac{\hat{p} \cdot (1-\hat{p})}{n}}\]
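The 95% interval is one case of a more general form. As a sketch, for a confidence level of \(1-\alpha\), with \(z_{1-\alpha/2}\) denoting the corresponding quantile of the standard normal distribution (notation introduced here for illustration, not used elsewhere in this section):
\[\hat{p} \pm z_{1-\alpha/2} \cdot \sqrt{\frac{\hat{p} \cdot (1-\hat{p})}{n}}\]
Common values are \(z_{0.95} \approx 1.645\) for a 90% interval, \(z_{0.975} \approx 1.96\) for a 95% interval, and \(z_{0.995} \approx 2.576\) for a 99% interval.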
Consider the voting data in gss2018. With \(n=772\) observations, the sample proportion (the mean of the 0/1 voting indicator) is 0.67 and the standard error is
\[\sigma_{\hat{p}}=\sqrt{\frac{0.67 \cdot (1-0.67)}{772}}=0.0169\]
In this case, the margin of error is \(1.96 \cdot 0.0169=0.033\). To calculate a confidence interval for a proportion in R, the function t.test() can be applied to the 0/1 indicator. Before using the function, the manual calculations are presented.
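# Keep respondents with a valid answer to the vote12 question
voting = subset(gss, vote12 %in% c("voted", "did not vote"))
# Recode to a 0/1 indicator: 1 = voted, 0 = did not vote
voting = ifelse(voting$vote12 == "voted", 1, 0)
# voting is now a vector, so use length() rather than nrow()
nobs = length(voting)
meandata = mean(voting)   # sample proportion p-hat
z = qnorm(0.975)          # critical value for a 95% interval, about 1.96
stderror = sqrt(meandata * (1 - meandata) / nobs)
CI_lower = meandata - z * stderror
CI_upper = meandata + z * stderror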
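The arithmetic of the worked example above (\(\hat{p}=0.67\), \(n=772\)) can be checked without the data set. The following is a minimal sketch that plugs those two numbers in directly; the variable names are chosen here for illustration.

```r
# Values taken from the worked example in the text
p_hat <- 0.67
n     <- 772

se  <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion
moe <- qnorm(0.975) * se               # margin of error for a 95% interval
ci  <- c(lower = p_hat - moe, upper = p_hat + moe)

round(se, 4)    # 0.0169
round(moe, 3)   # 0.033
round(ci, 3)    # lower 0.637, upper 0.703
```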
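Using the function t.test() is simpler. Because it uses the sample standard deviation and a t quantile instead of 1.96, its interval differs slightly from the manual one, but the difference is negligible for a sample of this size:
t.test(voting)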
##
## One Sample t-test
##
## data: voting
## t = 76.794, df = 2608, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 0.6756645 0.7110737
## sample estimates:
## mean of x
## 0.6933691
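To see how close the normal-approximation interval and the t.test() interval are, the two can be computed side by side. The sketch below uses a simulated 0/1 vector as a stand-in for the voting indicator; the data, seed, and variable names are assumptions made for illustration and are not the GSS sample.

```r
# Hypothetical data: simulated 0/1 outcomes, not the GSS voting variable
set.seed(1)
x <- rbinom(2000, size = 1, prob = 0.7)

n     <- length(x)
p_hat <- mean(x)

# Normal-approximation interval, as in the manual calculation
se_z <- sqrt(p_hat * (1 - p_hat) / n)
ci_z <- p_hat + c(-1, 1) * qnorm(0.975) * se_z

# t.test() interval: based on sd(x) / sqrt(n) and a t quantile with n - 1 df
ci_t <- as.numeric(t.test(x)$conf.int)

# Endpoint differences between the two intervals
ci_z - ci_t
```

With a sample this large the endpoints agree to several decimal places, and the t.test() interval is marginally wider. Base R also provides prop.test() and binom.test(), which are designed for proportions and use different interval constructions.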