An Application of the Binomial Distribution to Error Analysis

The binomial distribution is used to analyse the error in experimental results that estimate the proportion of individuals in a population that satisfy a condition of interest.

Example (1994): Suppose the population is women at least 35 years of age who are pregnant with a fetus afflicted by Down syndrome, and the condition of interest is testing positive on a non-invasive screening test for Down syndrome. The experiment is to take a random sample of 54 such women and see how many test positive. Since the experimental result is the number X of subjects that test positive, the sample space is {0,1,…,54}.

In one such experiment completed in 1994, X=48 of 54 women tested positive, so we estimate that 48/54 ≈ 89% of women in the population would test positive. How accurate is .89 as an estimate of the true proportion p of women in the population who would test positive?

This question is traditionally answered by giving a 95% confidence interval, .89 ± 1.96σ, where σ is the standard deviation of the estimate X/54 of p. Since the probability distribution of X is Bin(54,p), the standard deviation of X/54 is √(p(1-p)/54), which depends on the unknown p. Solve

.89 – 1.96 √(p(1-p)/54) < p < .89 + 1.96 √(p(1-p)/54)

for p to obtain the 95% confidence interval

.78 < p < .95 .
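
One way to carry out the solving step is sketched below. The notation is introduced here for illustration and is not in the original text: p̂ = X/n is the observed proportion, n is the sample size, and z = 1.96. Squaring both sides of the inequality and collecting powers of p gives a quadratic inequality in p, and the two roots of the quadratic are the endpoints of the interval.

```latex
\begin{align*}
|\hat p - p| &< z\sqrt{\frac{p(1-p)}{n}} \\
(\hat p - p)^2 &< \frac{z^2}{n}\,p(1-p) \\
\left(1+\frac{z^2}{n}\right)p^2 - \left(2\hat p+\frac{z^2}{n}\right)p + \hat p^2 &< 0 \\
p_{\pm} &= \frac{\hat p + \dfrac{z^2}{2n} \pm z\sqrt{\dfrac{\hat p(1-\hat p)}{n} + \dfrac{z^2}{4n^2}}}{1+\dfrac{z^2}{n}},
\qquad p_- < p < p_+ .
\end{align*}
```

With p̂ = 48/54, n = 54, and z = 1.96 the roots are roughly .78 and .95. The midpoint of the two roots, (p̂ + z²/2n)/(1 + z²/n), is a weighted average of p̂ and 1/2, so it lies below p̂ whenever p̂ > 1/2.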

Reference (1994): James E. Haddow, Glenn E. Palomaki, George J. Knight, George C. Cunningham, Linda S. Lustig, and Patricia A. Boyd, Reducing the Need for Amniocentesis in Women 35 Years of Age or Older with Serum Markers for Screening, New England Journal of Medicine, Volume 330:1114-1118, April 21, 1994, Number 16.

Note: This confidence interval is not symmetric about the estimate .89, since the binomial distribution is not symmetric about its mean. It is used in Richard Ball’s discussion of issues relevant to women who might consider a more invasive test for Down syndrome. It was derived by Lynne Butler using the procedure discussed in Chapter 7 of Devore’s text Probability and Statistics for Engineering and the Sciences.
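
The asymmetry can be checked numerically. The following Python sketch assumes the interval is obtained by squaring the inequality |X/n - p| < 1.96 √(p(1-p)/n) and solving the resulting quadratic in p. The function name score_interval and its parameters are hypothetical, chosen only for illustration.

```python
import math

def score_interval(x, n, z=1.96):
    """Endpoints of the set of p with |x/n - p| < z*sqrt(p(1-p)/n).

    Assumption: the endpoints are the roots of the quadratic obtained
    by squaring both sides of the inequality.
    """
    p_hat = x / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# 1994 data: 48 of 54 women tested positive.
p_hat = 48 / 54
lower, upper = score_interval(48, 54)

print(f"interval: {lower:.2f} < p < {upper:.2f}")
print(f"distance below the estimate: {p_hat - lower:.3f}")
print(f"distance above the estimate: {upper - p_hat:.3f}")
```

The lower endpoint sits farther from the estimate than the upper endpoint does, which is the asymmetry described in the note.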

Reference for confidence interval calculation: Alan Agresti and Brent A. Coull, Approximate is Better than “Exact” for Interval Estimation of Binomial Proportions, The American Statistician, Volume 52:119-126, 1998.

Example (2005): The data used in the above example are from a study completed in 1994. In November 2005, the results of another study using a combined screening test were published. Out of a sample of 64 women at least 35 years of age pregnant with a fetus afflicted by Down syndrome, 61 tested positive. We therefore estimate that 61/64 ≈ 95% of women in the population would test positive. To determine the accuracy of .95 as an estimate of the true proportion p of women in the population who would test positive, we calculate the 95% confidence interval in the same way as in the above example. Solve

.95 – 1.96 √(p(1-p)/64) < p < .95 + 1.96 √(p(1-p)/64)

for p to obtain the 95% confidence interval

.87 < p < .98 .
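
As a rough cross-check that does not rely on any closed-form solution, the inequality can also be tested directly on a fine grid of candidate values of p. This is a brute-force sketch, not the procedure described in the text. The function name scan_interval and the grid size are assumptions, and the estimate is taken as 61/64 rather than the rounded .95.

```python
import math

def scan_interval(x, n, z=1.96, steps=100_000):
    """Smallest and largest grid values of p satisfying
    |x/n - p| < z*sqrt(p(1-p)/n).

    Assumption: a grid of spacing 1/steps is fine enough for
    two-decimal accuracy.
    """
    p_hat = x / n
    inside = []
    for i in range(1, steps):
        p = i / steps
        if abs(p_hat - p) < z * math.sqrt(p * (1 - p) / n):
            inside.append(p)
    return min(inside), max(inside)

# 2005 data: 61 of 64 women tested positive.
lo, hi = scan_interval(61, 64)
print(f"{lo:.2f} < p < {hi:.2f}")  # prints "0.87 < p < 0.98"
```

The values of p that satisfy the inequality form a single interval, so the smallest and largest grid points that pass the test approximate its endpoints.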

Reference (2005): Fergal D. Malone, M.D., Jacob A. Canick, Ph.D., Robert H. Ball, M.D., David A. Nyberg, M.D., Christine H. Comstock, M.D., Radek Bukowski, M.D., Richard L. Berkowitz, M.D., Susan J. Gross, M.D., Lorraine Dugoff, M.D., Sabrina D. Craigo, M.D., Ilan E. Timor-Tritsch, M.D., Alicja R. Rudnicka, Ph.D., Allan K. Hackshaw, M.Sc., Geralyn Lambert-Messerlian, Ph.D., Nicholas J. Wald, F.R.C.P., and Mary E. D’Alton, M.D., for the First- and Second-Trimester Evaluation of Risk (FASTER) Research Consortium, First-Trimester or Second-Trimester Screening, or Both, for Down’s Syndrome, New England Journal of Medicine, Volume 353:2001-2011, November 10, 2005, Number 19.