7.3 Estimation and Estimators
Assume the simple setting where the random variable \(X\) represents a population with probability density function \(f(x;\theta)\). The density is assumed to be known except for the parameter \(\theta\), which is unknown. We therefore have to sample from the population to gather information about the parameter of interest: we perform random sampling and collect \(X_1, X_2, \dots, X_n\), an independent and identically distributed (i.i.d.) random sample drawn from the probability density function \(f(x;\theta)\). In the most abstract formulation, an estimator \(W\) of the parameter \(\theta\) is a function of the sample, \(W=h(X_1, X_2, \dots, X_n)\).
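To make this setup concrete, here is a minimal simulation sketch. It assumes a normal population with unknown center \(\theta\) and uses the arithmetic average as the function \(h\); the distribution, sample size, and seed are illustrative choices, not fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(42)

theta = 5.0  # the "unknown" parameter (known here only because we simulate)
n = 30       # sample size

# Draw an i.i.d. random sample X_1, ..., X_n from f(x; theta),
# assumed here to be Normal(theta, 1).
sample = rng.normal(loc=theta, scale=1.0, size=n)

# An estimator W = h(X_1, ..., X_n); here h is the arithmetic average.
def h(x):
    return x.mean()

W = h(sample)
print(f"estimate W = {W:.3f} for true theta = {theta}")
```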
The goal of a good estimation procedure is to determine an unbiased point estimator \(\hat{\theta}\) of the population parameter \(\theta\). To evaluate an estimation procedure, we analyze the sampling distribution associated with \(\hat{\theta}\). From a theoretical perspective, the bias \(b\) of an estimator can be expressed as \[b(\hat{\theta})=E(\hat{\theta})-\theta\] The mean squared error of the point estimator \(\hat{\theta}\) is defined as \[MSE(\hat{\theta})=E\big((\hat{\theta}-\theta)^2\big)\] Note that unbiasedness does not mean that \(\hat{\theta}=\theta\) in any one sample; rather, if we could repeat the sampling infinitely often, the average of the resulting values of \(\hat{\theta}\) would equal \(\theta\).

Some commonly used statistics are the sample mean, the sample variance, the sample standard deviation, and the sample mid-range. The sample mean is the arithmetic average of the values in a random sample. It is denoted \[\bar{X}(X_1, X_2, \dots, X_n)=\frac{X_1+X_2+\dots+X_n}{n}=\frac{1}{n}\sum_{i=1}^{n}X_i\] The observed value of \(\bar{X}\) in any sample is denoted by the lowercase letter, i.e., \(\bar{x}\). The sample variance is the statistic defined by \[S^2(X_1, X_2, \dots, X_n)=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^{2}\] The observed value of \(S^2\) in any sample is denoted by the lowercase letter, i.e., \(s^2\). The sample standard deviation is the statistic defined by \(S=\sqrt{S^2}\). The sample mid-range is the statistic defined by \[\frac{\max(X_1, X_2, \dots, X_n)+\min(X_1, X_2, \dots, X_n)}{2}\]

Imagine that you have two estimation methods for the estimator \(\hat{\theta}\). Which one do you choose? You would choose the more efficient estimator, i.e., the one with the smaller sampling variance.
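As an illustrative sketch, the following Monte Carlo simulation approximates the bias, sampling variance, and MSE of two estimators of the center \(\theta\) of a normal population: the sample mean \(\bar{X}\) and the sample mid-range. The population, sample size, and number of replications are assumptions chosen for illustration; for the normal distribution the sample mean turns out to be the more efficient of the two.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 5.0     # true parameter (the center of the population)
n = 30          # sample size
reps = 100_000  # number of simulated samples

# Draw `reps` i.i.d. samples of size n from the assumed population f(x; theta).
samples = rng.normal(loc=theta, scale=1.0, size=(reps, n))

# Two competing estimators of theta, evaluated on every sample.
mean_hat = samples.mean(axis=1)                                 # sample mean
midrange_hat = (samples.max(axis=1) + samples.min(axis=1)) / 2  # sample mid-range

for name, est in [("sample mean", mean_hat), ("mid-range", midrange_hat)]:
    bias = est.mean() - theta           # b(theta_hat) = E(theta_hat) - theta
    var = est.var()                     # sampling variance of the estimator
    mse = np.mean((est - theta) ** 2)   # MSE(theta_hat) = E((theta_hat - theta)^2)
    print(f"{name:>11}: bias={bias:+.4f}  variance={var:.4f}  MSE={mse:.4f}")
```

Both estimators are approximately unbiased here, so the comparison comes down to the sampling variance: the estimator with the smaller variance is the more efficient one.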