4.4 Empirical Cumulative Distribution Function

The empirical cumulative distribution function of a sample of \(n\) observations can be written as \[F_n(x) = \frac{\text{number of elements} \leq x}{n}\] For the previously mentioned Old Faithful data, the empirical cumulative distribution functions are shown in Figure 4.3.

Important measures derived from the empirical cumulative distribution function are quantiles. A quantile is a value below which a given share of the ordered observations lies: for the \(p\)% quantile, \(p\)% of the observations are less than or equal to it. If the distribution is divided into \(k\) equal shares, there are \(k-1\) quantiles. Commonly used quantiles are quartiles, quintiles, deciles, and percentiles. Quartiles divide the sample into 4 equal shares, which gives 3 quartiles: the 25%, 50%, and 75% quartile. Quintiles divide the sample into 5 equal shares, deciles into 10 equal shares, and percentiles into 100 equal shares. Closely related to quartiles is the interquartile range (IQR), the difference between the 75% and the 25% quartile, which spans the middle 50% of the observations.
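As a brief illustration of the quantiles described above, the following sketch computes the quartiles, the interquartile range, and the deciles of the eruption times. It assumes the built-in `faithful` data set and uses only base R; the names `x`, `q`, and `iqr_manual` are chosen here for illustration. Note that `quantile()` interpolates between observations by default, so other quantile definitions can give slightly different values.

```r
# Eruption times from the built-in Old Faithful data set
x <- faithful$eruptions

# 25%, 50%, and 75% quantiles, i.e. the three quartiles
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
q

# Interquartile range: 75% quartile minus 25% quartile
iqr_manual <- unname(q["75%"] - q["25%"])
iqr_manual
IQR(x)  # built-in equivalent, should agree with iqr_manual

# Deciles: 9 cut points that divide the sample into 10 equal shares
quantile(x, probs = seq(0.1, 0.9, by = 0.1))
```

The 50% quartile is the median, so `q["50%"]` should match `median(x)`.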

# Plot the ECDFs of eruption and waiting time side by side
faithful <- data.frame(faithful)
par(mfrow = c(1, 2))
plot(ecdf(faithful$eruptions), main = "Eruption Time", xlab = "Minutes", xlim = c(1, 6), lty = 1)
plot(ecdf(faithful$waiting), main = "Waiting Time", xlab = "Minutes", xlim = c(40, 100), lty = 1)

Figure 4.3: Empirical cumulative distribution functions of eruption and waiting time of Old Faithful geyser
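The step functions in the figure can also be evaluated at single points. The sketch below, again assuming the built-in `faithful` data set, compares a direct implementation of the formula for \(F_n(x)\) with the function returned by `ecdf()`. The helper `ecdf_manual` and the evaluation point of 70 minutes are illustrative choices, not part of the original analysis.

```r
# Waiting times from the built-in Old Faithful data set
x <- faithful$waiting
n <- length(x)

# F_n(t): share of observations less than or equal to t
ecdf_manual <- function(t) sum(x <= t) / n

# ecdf() returns a step function that can be called like any R function
Fn <- ecdf(x)

ecdf_manual(70)  # share of waiting times of at most 70 minutes
Fn(70)           # same value from the built-in ECDF

# Quantiles invert the ECDF: type = 1 uses the inverse of the
# empirical distribution function without interpolation
med <- quantile(x, probs = 0.5, type = 1)
Fn(med)  # at least 0.5 by construction
```

With `type = 1`, the quantile is always one of the observed values, which makes the link between quantiles and the empirical cumulative distribution function explicit.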