                                 PROPCI
    A Program that Calculates Confidence Intervals for Proportions
                      Version 1.2, January 8, 1990
                          (c) 1988, 1989, 1990

                                    
                                   by
                           Kevin M. Sullivan
                         Division of Nutrition
       Center for Chronic Disease Prevention and Health Promotion
                      Centers for Disease Control
                      1600 Clifton Road NE, MS A08
                           Atlanta, GA  30333

   This program  was developed  to calculate  confidence intervals for a
proportion by  use of the following methods: the normal approximation to
the binomial,  the normal  approximation with  a correction  factor, the
method by  Wilson, the  quadratic method, the exact binomial method, and
the mid-p  (Miettinen) method.   Either 90, 95, or 99 percent confidence
intervals can  be calculated.  The user inputs the number of individuals
who have  the event  of interest (x) and the sample size (n).  The point
estimate is x/n.  The standard error (SE) of the normal approximation to
the binomial(1,2) is:
         ____
  SE = \/pq/n

  where
   p=x/n
   q=1-p
   n=denominator

The confidence interval for the normal approximation is:

  p +/- Z * SE

  where
   Z = Z value, e.g., for 95% two-sided CI this is 1.96
   SE = the standard error calculated above
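
   As an illustration, the normal approximation can be sketched in a few
lines of Python.  This is not PROPCI's own source code; the function name
normal_approx_ci and its arguments are assumptions made for this example.

```python
import math

def normal_approx_ci(x, n, z=1.96):
    """Normal approximation to the binomial (hypothetical helper).

    Returns (p, se, lower, upper) as proportions, not percentages.
    """
    p = x / n                      # point estimate
    q = 1 - p
    se = math.sqrt(p * q / n)      # standard error
    return p, se, p - z * se, p + z * se

# z=1.645 is the usual two-sided 90% value (an assumption, not from PROPCI).
p, se, lower, upper = normal_approx_ci(10, 11, z=1.645)
print(f"{100 * p:.3f}  {100 * se:.3f}  {100 * lower:.3f}, {100 * upper:.3f}")
```

With x=10 and n=11 this should agree with the first row of the EXAMPLE
screen to within rounding.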

   The lower and upper confidence limits for a normal approximation with
a correction factor(2) are:

  p - Z * SE - 1/(2n)

and

  p + Z * SE + 1/(2n)
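
   A minimal Python sketch of the corrected limits follows.  The name
corrected_normal_ci is hypothetical, and the code is only a sketch of the
two formulas above, not PROPCI's implementation.

```python
import math

def corrected_normal_ci(x, n, z=1.96):
    """Normal approximation widened by the correction factor 1/(2n)."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    cf = 1 / (2 * n)               # correction factor
    return p - z * se - cf, p + z * se + cf

# z=1.645 is the usual two-sided 90% value (an assumption for this example).
lower, upper = corrected_normal_ci(10, 11, z=1.645)
print(f"{100 * lower:.3f}, {100 * upper:.3f}")
```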

   The normal approximation to the binomial can produce estimates
outside of the 0-100 percent limits.  PROPCI reports the results of
the normal approximation calculations even if the estimates are outside
these limits, although most authors truncate the estimates to the limits.
One suggested criterion is that the normal approximation is inappropriate
when npq<5.(2)  A more accurate approximation to the confidence interval
for a proportion is calculated using the quadratic method.(1)  This
quadratic formula includes a correction factor.  The formula for the
lower bound of the quadratic method is:


                              ____________________________
                 2           |  2
         (2np + Z  - 1) - Z \| Z  - (2 + 1/n) + 4p(nq + 1)
         _________________________________________________
                                   2
                            2(n + Z )
     
     and the upper bound is:

                              ____________________________
                 2           |  2
         (2np + Z  + 1) + Z \| Z  + (2 - 1/n) + 4p(nq - 1)
         _________________________________________________
                                   2
                            2(n + Z )
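
   The two quadratic formulas translate directly into code.  The sketch
below is an illustration only; the name quadratic_ci is hypothetical, and
the handling of x=0 and x=n (limits set to 0 and 1) is an assumption that
is not described in this document.

```python
import math

def quadratic_ci(x, n, z=1.96):
    """Quadratic method with correction factor, as proportions."""
    p = x / n
    q = 1 - p
    denom = 2 * (n + z**2)
    if x == 0:
        lower = 0.0                # assumed convention, not from PROPCI
    else:
        root = math.sqrt(z**2 - (2 + 1 / n) + 4 * p * (n * q + 1))
        lower = ((2 * n * p + z**2 - 1) - z * root) / denom
    if x == n:
        upper = 1.0                # assumed convention, not from PROPCI
    else:
        root = math.sqrt(z**2 + (2 - 1 / n) + 4 * p * (n * q - 1))
        upper = ((2 * n * p + z**2 + 1) + z * root) / denom
    return lower, upper

# z=1.645 is the usual two-sided 90% value (an assumption for this example).
lower, upper = quadratic_ci(10, 11, z=1.645)
print(f"{100 * lower:.3f}, {100 * upper:.3f}")
```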
     
  Another approximate method is by Wilson.(3) It appears that the method
by Wilson  is a  quadratic equation  without the correction factor.  The
formula for the lower bound is:
                 _                   _____________ _
                |        2          |            2  |
           n    |  x    Z           | x(n-x)    Z   |
         ______ | ___ + __  -  Z    | ______ + ___  |
             2  |                   |    3       2  |
          n+Z   |_ n    2n         \|   n      4n  _|
     
     and the upper bound is:
                 _                   _____________ _
                |        2          |            2  |
           n    |  x    Z           | x(n-x)    Z   |
         ______ | ___ + __  +  Z    | ______ + ___  |
             2  |                   |    3       2  |
          n+Z   |_ n    2n         \|   n      4n  _|
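
   As a check on the Wilson limits, here is a short Python sketch.  It uses
the standard form of the Wilson interval, in which Z multiplies the square
root and the two terms under the root are added; with x=10 and n=11 that
form agrees with the Wilson row of the EXAMPLE screen.  The name wilson_ci
is hypothetical.

```python
import math

def wilson_ci(x, n, z=1.96):
    """Wilson limits (no correction factor), as proportions."""
    scale = n / (n + z**2)
    centre = x / n + z**2 / (2 * n)
    half = z * math.sqrt(x * (n - x) / n**3 + z**2 / (4 * n**2))
    return scale * (centre - half), scale * (centre + half)

# z=1.645 is the usual two-sided 90% value (an assumption for this example).
lower, upper = wilson_ci(10, 11, z=1.645)
print(f"{100 * lower:.3f}, {100 * upper:.3f}")
```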
     
   The  exact  binomial  confidence  interval  is  calculated  by  using
formulas as described by Rosner(2) and Rothman.(3)  The formulas for the
lower and  upper limits  for a  two-sided 95% confidence interval (i.e.,
.025 in each tail) are:
             n
            ___     n!     k      n-k
     .025 = \    -------- p (1-p )
            /    k!(n-k)!  1    1
            ---
            k=x
     
             x
            ___     n!     k      n-k
     .025 = \    -------- p (1-p )
            /    k!(n-k)!  2    2
            ---
            k=0
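
   The two equations above have no closed-form solution for p1 and p2,
but each tail sum is monotone in p, so a simple bisection search will find
the limits.  The sketch below illustrates that idea only; it is not the
method PROPCI uses, and the names binom_tail and exact_ci are hypothetical.

```python
import math

def binom_tail(x, n, p, upper_side):
    """P(X >= x) if upper_side else P(X <= x), for X ~ Binomial(n, p)."""
    ks = range(x, n + 1) if upper_side else range(0, x + 1)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in ks)

def exact_ci(x, n, level=95, tol=1e-10):
    """Exact binomial limits by bisection, as proportions."""
    tail = (1 - level / 100) / 2       # e.g. .025 for a 95% interval

    def solve(upper_side):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            value = binom_tail(x, n, mid, upper_side)
            # P(X >= x) rises with p; P(X <= x) falls with p.
            p_too_small = value < tail if upper_side else value > tail
            if p_too_small:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(upper_side=True)
    upper = 1.0 if x == n else solve(upper_side=False)
    return lower, upper

lower, upper = exact_ci(10, 11, level=90)
print(f"{100 * lower:.3f}, {100 * upper:.3f}")
```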
     
   Exact mid-p (Miettinen) confidence intervals are calculated by using
formulas as described by Rothman.(3)  The formulas for the lower and upper
limits for a two-sided 95% confidence interval (i.e., .025 in each tail)
are:


                                        n
            1      n!     x      n-x   ___     n!     k      n-k
     .025 = - * -------- p (1-p )    + \    -------- p (1-p )
            2   x!(n-x)!  1    1       /    k!(n-k)!  1    1
                                       ---
                                      k=x+1
     
                                       x-1
            1      n!     x      n-x   ___     n!     k      n-k
     .025 = - * -------- p (1-p )    + \    -------- p (1-p )
            2   x!(n-x)!  2    2       /    k!(n-k)!  2    2
                                       ---
                                       k=0
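
   The mid-p equations can be solved the same way as any monotone
equation in p.  The Python sketch below uses bisection purely as an
illustration; the names midp_tail and midp_ci are hypothetical and the
search method is an assumption, not a description of PROPCI.

```python
import math

def midp_tail(x, n, p, upper_side):
    """Half of P(X = x) plus the probability of values beyond x."""
    def pmf(k):
        return math.comb(n, k) * p**k * (1 - p)**(n - k)
    ks = range(x + 1, n + 1) if upper_side else range(0, x)
    return 0.5 * pmf(x) + sum(pmf(k) for k in ks)

def midp_ci(x, n, level=95, tol=1e-10):
    """Mid-p limits by bisection, as proportions."""
    tail = (1 - level / 100) / 2       # e.g. .025 for a 95% interval

    def solve(upper_side):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            value = midp_tail(x, n, mid, upper_side)
            # The upper-side sum rises with p; the lower-side sum falls.
            p_too_small = value < tail if upper_side else value > tail
            if p_too_small:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # upper_side=True gives the lower limit p1; False gives p2.
    return solve(upper_side=True), solve(upper_side=False)

lower, upper = midp_ci(10, 11, level=90)
print(f"{100 * lower:.3f}, {100 * upper:.3f}")
```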
     
   For each  proportion, the  normal  approximation  (with  and  without
correction factor),  Wilson,  and  quadratic  confidence  intervals  are
automatically provided.  If npq<5, a message is provided near the bottom
of the  screen warning  users that  the normal  approximation may not be
appropriate.  Next, the user is asked whether they would like to have
exact confidence intervals calculated (the default is "no").  Both the
exact binomial and mid-p formulas require iterative solutions to
determine the values of the lower and upper confidence limits and
therefore are not performed automatically.  PROPCI uses a fast method
based on the F-distribution to determine the exact confidence
intervals.(3)  Finally, the user is asked whether they would like to
perform another calculation or return to DOS.
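
   This document does not reproduce the F-distribution formulas.  For
reference, the relation commonly used for the exact binomial limits is
shown below in LaTeX notation.  It is the standard textbook form and is
given here as an assumption about what the cited method computes; the
notation may differ from that of the cited source.

```latex
% F_{\gamma;\,\nu_1,\nu_2} is the \gamma quantile of the F-distribution with
% \nu_1 and \nu_2 degrees of freedom; \alpha = 1 - (confidence level).
% Take p_1 = 0 when x = 0 and p_2 = 1 when x = n.
p_1 = \frac{x}{x + (n - x + 1)\,F_{1-\alpha/2;\;2(n-x+1),\;2x}}
\qquad
p_2 = \frac{(x + 1)\,F_{1-\alpha/2;\;2(x+1),\;2(n-x)}}
           {(n - x) + (x + 1)\,F_{1-\alpha/2;\;2(x+1),\;2(n-x)}}
```

For x=10, n=11 and a 90% interval, the tabled value of F(.95; 4, 20) is
approximately 2.87, which gives p1 of roughly 10/(10 + 2(2.87)), or about
0.635, in line with the Exact Binomial lower limit in the EXAMPLE screen.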
   Which confidence  interval method should you use?  My opinion is that
among the  approximate methods  (i.e., normal approximation, Wilson, and
quadratic), the  quadratic provides  the  best  estimate  of  the  exact
binomial confidence  interval.   If the data are sparse, then use one of
the exact methods (exact binomial or mid-p).
     
EXAMPLE
     
   In this example from Rothman,(3) x=10  and n=11  with 90%  confidence
intervals.
     
 +--------------------------------------------------------------------+
 | 01/08/90               **  PROPCI 1.2  **                          |
 +--------------------------------------------------------------------+
 
           Numerator: 10         / Denominator: 11
   
           Enter two-sided confidence level (90, 95, or 99%):  90
      
           The point estimate is:  90.909%
      
 Confidence Interval Method                Std Error       90% CI
 Normal Approx. to the Binomial               8.668    76.650, 105.168
 Normal Approx. with Correction Factor        8.668    72.105, 109.713
 Wilson Method                                         67.719,  97.945
 Quadratic Method                                      62.330,  99.372
 Exact Binomial                                        63.564,  99.535
 Miettinen Limits (Mid-p)                              67.759,  99.090
      
     **The normal approximation may not be valid for this example**
               Would you like to do another? (Y/N) Y
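
   The warning on the screen above follows from the npq<5 criterion: here
npq = 11 * (10/11) * (1/11), which is about 0.91.  A small Python sketch of
that check is given below; the function name and the threshold argument
are hypothetical.

```python
def normal_approx_questionable(x, n, threshold=5):
    """True when n*p*q falls below the threshold (npq<5 by default)."""
    p = x / n
    return n * p * (1 - p) < threshold

print(normal_approx_questionable(10, 11))    # npq is about 0.91
print(normal_approx_questionable(50, 100))   # npq is 25
```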
      


     
     
DIFFERENCES BETWEEN VERSION 1.2 AND PREVIOUS VERSIONS
     
   Version 1.2 implements the F-distribution method to arrive at the
exact confidence limits.  This dramatically reduces the computation time
compared to other iterative procedures and produces exactly the same
results.
     
DISTRIBUTION CONDITIONS
     
   NON-WARRANTY.   PROPCI is  provided "as  is" and without any warranty
expressed or  implied.  The user assumes all risks of the use of PROPCI.
PROPCI may  not run  on your particular hardware/software configuration.
We bear no responsibility for any mishap or economic loss resulting
from the use of this software.
   COPYRIGHT CONDITIONS.   You  may make and distribute copies of PROPCI
provided that there is no material gain involved.
   USE AT YOUR OWN RISK.  All risk of loss of any kind due to the use of
PROPCI is with you, the user.  You are responsible for all mishaps, even
if the  program proves  to be  defective.   This program  makes  certain
assumptions about  the data.   These  assumptions affect the validity of
conclusions made based on the output from this program.
     
   Please  acknowledge   PROPCI  in   any  manuscript   that  uses   its
calculations.
     
REFERENCES
     
1.   Fleiss JL.   Statistical Methods for Rates and Proportions, 2nd Ed.
     John Wiley & Sons, New York, 1981.
2.   Rosner B.   Fundamentals  of Biostatistics.  Duxbury Press, Boston,
     1982.
3.   Rothman KJ, Boice JD Jr.  Epidemiologic Analysis with a Programmable
     Calculator.  NIH Pub No. 79-1649.  National Institutes of Health,
     Bethesda, MD, 1979; 31-32.
