




                           CHAPTER  5

                       SUMMARY STATISTICS



  The summary statistics module of the SPPC System enables you to 
produce a variety of elementary descriptive statistics  for  each
variable  in a raw data file.  It allows you to enter the name of 
your  raw  data  file,  use  all  variables  or  select  specific
variables  from  your  data  file, and then produce the following
statistics for each of the variables you've selected or retained.

     Summary Statistics: Variable Name

     N 
     Mean             Std Dev        
     Std Err Mean     Biased Std Dev 
     Skewness         Kurtosis       
     Minimum          Maximum        
     Range            Coeff of Variation   
     Geometric mean   Harmonic mean  

     Sum X    
     Sum X^2            Sum x^2     Note: x = (X - Mean X)
     Sum X^3            Sum x^3  
     Sum X^4            Sum x^4  

     Missing values code
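
  As an illustration of the Note above, the following Python
sketch computes the raw power sums (Sum X through Sum X^4) and
the deviation sums (Sum x^2 through Sum x^4, where x = X - Mean
X).  The function name power_sums and the dictionary keys are
hypothetical; they are not part of the SPPC.

```python
def power_sums(values):
    """Return raw and deviation power sums for a list of numbers."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n
    # Lower-case x in the manual: deviation of each case from the mean.
    deviations = [v - mean for v in values]
    return {
        "N": n,
        "Sum X": sum(values),
        "Sum X^2": sum(v ** 2 for v in values),
        "Sum X^3": sum(v ** 3 for v in values),
        "Sum X^4": sum(v ** 4 for v in values),
        "Sum x^2": sum(d ** 2 for d in deviations),
        "Sum x^3": sum(d ** 3 for d in deviations),
        "Sum x^4": sum(d ** 4 for d in deviations),
    }


sums = power_sums([1.0, 2.0, 3.0, 4.0])
print(sums["Sum X^2"], sums["Sum x^2"])
```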


                       USING THE PROCEDURE

  The summary statistics module of the SPPC is very easy to use.
First, select the "Summary Statistics" option shown in the Master
Menu of the SPPC program.  The summary statistics procedure will
then be loaded into your computer's memory.

  Once the module is loaded into memory, you will see the opening
menu, which asks whether you would like to compute geometric and
harmonic means.  This choice is offered because such means are
rarely needed and because computing them is time-consuming on
systems that do not have a high-speed math co-processor.
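
  For reference, the two optional means can be sketched in Python
as follows.  The sketch uses the conventional definitions: the
geometric mean as the exponential of the mean logarithm, and the
harmonic mean as N divided by the sum of the reciprocals.  The
manual does not state which formulas the SPPC uses, and the
function names are hypothetical.  Both means are defined here
only for positive values.

```python
import math


def geometric_mean(values):
    """Exponential of the mean of the natural logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    # One logarithm per case: this is the costly step without a
    # math co-processor.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Number of cases divided by the sum of the reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    return len(values) / sum(1.0 / v for v in values)


print(geometric_mean([1.0, 2.0, 4.0]))
print(harmonic_mean([1.0, 2.0, 4.0]))
```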

  The module will then ask you to enter the name of the data file
that  you  wish  to use.  When entering the file name, be sure to 
include the disk drive and any necessary directory  paths.   This
feature  of  the program means that you may fetch data files from
any available diskette or any defined area on any hard disk.

  Because your data file may be stored on a diskette, the module
will instruct you to insert your data diskette in the drive that
you indicated when you entered the file name.  If your data file
resides on a hard disk, just press the space bar.

  The  module  will  then  search  for your raw data file.  If it 
cannot find it in the location you  specified,  you  will  be  so
notified  and then given another opportunity to enter the name of 
your file.  If your data file is found, it will be described  for
you  on  screen.   You must then indicate whether you wish to use
the file that is described.

  Once you have indicated that you will use the located data
file, the module will ask if you wish to delete any variables.
If you indicate that you do not, the module will provide summary
statistics for all of your variables.  If you indicate that you
do, the module will present the name of each variable in your
data file and ask whether you wish to RETAIN it.

  When you have finished selecting variables, the module will ask
you to enter the value that marks "missing values" in your data
file.  The system allows only one missing values code, which is
applied to every variable, and it asks you to confirm the value
you entered.
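
  One way to picture the missing values check is the Python
sketch below.  It assumes that a case is skipped for a variable
when its value exactly equals the missing values code; the manual
does not say how the SPPC makes the comparison, and the function
name drop_missing is hypothetical.  The code -1.0 matches the
value shown in the sample outputs at the end of this chapter; the
data values are made up for illustration.

```python
def drop_missing(values, missing_code):
    """Keep only the cases whose value differs from the missing code."""
    return [v for v in values if v != missing_code]


raw_values = [4.0, -1.0, 7.5, -1.0, 3.0]   # illustrative data only
valid = drop_missing(raw_values, missing_code=-1.0)

# N counts only the cases that survive the check.
print(len(valid), valid)
```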

  None  of  the  summary statistics will be sent directly to your 
printer.  However, all of your results are saved for you  as  you
work  along.   Then, when you exit the SPPC you will be given the 
option of reviewing your work on screen, printing it, or  storing
it in a disk file of your choice.

  Once  you have indicated the foregoing options and choices, the 
module will begin to read and process your data file.   When  the
entire data file has been read and processed, the system will ask
whether  you  wish  to read more data from another file.  At that 
point  you  may  indicate  your  choice  and  provide  any   file
description  information that the system requires.  It will guide
you completely. 

  Your  final  results  will  now  be presented to you on screen. 
When you are finished  with  the  display  of  your  output,  the
original  summary  statistics screen will reappear.  You may then 
use the procedure again with the same or a different file or  you
may exit and return to the Master Menu of the SPPC program.


                        DATA REQUIREMENTS

  The summary statistics procedure requires the use of a raw data 
file that has been created for use by the SPPC program.   If  you
are  not  familiar  with the structure of such files, please read 
Chapter 2 of the User's Manual.


                         NUMBER OF CASES

  For practical purposes, there is no limit to the number of
cases that you may have in a raw data file; the theoretical limit
is 30,000 cases.  You may fill an entire diskette with a single
data file, and you may continue a data file onto as many
additional diskettes as you like.


                         ESTIMATED CASES

  Under normal circumstances, the second line of your data file
will contain the number of cases, the number of variables, and
the letter R.  If you enter the number of cases as zero (0), the
procedure will count the cases in your file and report that
number to you.  CAUTION: Only the number of cases may be left for
the procedure to count.  The number of variables must always be
entered.
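
  The handling of the second line can be sketched in Python as
shown below.  The sketch assumes that the three items are
separated by blanks, which is a guess; Chapter 2 of the User's
Manual describes the actual layout.  The function names
parse_header and resolve_case_count are hypothetical.

```python
def parse_header(line):
    """Split the second line into (cases, variables).

    Assumes blank-separated fields: cases, variables, the letter R.
    """
    fields = line.split()
    if len(fields) != 3 or fields[2] != "R":
        raise ValueError("expected: <cases> <variables> R")
    n_cases, n_vars = int(fields[0]), int(fields[1])
    if n_cases < 0:
        raise ValueError("the number of cases cannot be negative")
    if n_vars <= 0:
        # Only the number of cases may be left for the procedure to count.
        raise ValueError("the number of variables must be given")
    return n_cases, n_vars


def resolve_case_count(declared_cases, records):
    """Use the declared count, or count the records when it is zero."""
    return declared_cases if declared_cases > 0 else len(records)


n_cases, n_vars = parse_header("0 3 R")          # illustrative header
records = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]     # illustrative cases
print(resolve_case_count(n_cases, records), n_vars)
```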


                       NUMBER OF VARIABLES

  The maximum number of variables that you may have in any raw
data file is 200.  However, if your file has more than 50
variables, you must select a subset of 50 or fewer variables for
actual processing.  In that case the program will automatically
place you in the "variable selection" mode.
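
  The two limits can be expressed as a short Python sketch.  The
constant and function names are hypothetical; only the values 200
and 50 come from this manual.

```python
MAX_FILE_VARIABLES = 200       # most variables a raw data file may hold
MAX_PROCESSED_VARIABLES = 50   # most variables processed in one run


def selection_required(n_file_vars):
    """Return True when the file forces "variable selection" mode."""
    if not 1 <= n_file_vars <= MAX_FILE_VARIABLES:
        raise ValueError("a file may hold 1 to 200 variables")
    return n_file_vars > MAX_PROCESSED_VARIABLES


def check_selection(retained_names):
    """Reject a selection that is empty or exceeds 50 variables."""
    if not 1 <= len(retained_names) <= MAX_PROCESSED_VARIABLES:
        raise ValueError("retain between 1 and 50 variables")
    return retained_names


print(selection_required(50), selection_required(51))
```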


                         SAMPLE OUTPUTS

  The following sample output was obtained from the first three
variables in the Longley data.  Those data are stored on this
diskette in the file named LONGLEYR.DAT, which you may use as an
exercise to test your use of the summary statistics procedure.


SUMMARY STATISTICS

(1) Variable Name = Employed                                                  
Mean      = 65.31700      Std Dev     = 3.51197       N = 16                  
SE Mean   = 0.87799       Biased S.D. = 3.40045                               
Skew      = -0.09431      Kurtosis    = -1.35148                              
Minimum   = 60.17100      Maximum     = 70.55100                              
Range     = 10.38000      Coeff of var = 65.49403                             
Geometric mean = 65.22806  Harmonic mean  = 65.13879                          
                                                                              
Sum X   = 1045.07200                                                          
Sum X^2 = 68445.97665          Sum x^2 = 185.00883                            
Sum X^3 = 4494794.96843        Sum x^3 = -59.33140                            
Sum X^4 = 295946338.81767      Sum x^4 = 3526.62700                           
                                                                              
Missing value = -1.00000                                                      


(2) Variable Name = GNP_Deflator                                              
Mean      = 101.68125     Std Dev     = 10.79155      N = 16                  
SE Mean   = 2.69789       Biased S.D. = 10.44888                              
Skew      = -0.14640      Kurtosis    = -1.17419                              
Minimum   = 83.00000      Maximum     = 116.90000                             
Range     = 33.90000      Coeff of var = 102.75499                            
Geometric mean = 101.13503  Harmonic mean  = 100.58128                        
                                                                              
Sum X   = 1626.90000                                                          
Sum X^2 = 167172.09000         Sum x^2 = 1746.86437                           
Sum X^3 = 17350841.58300       Sum x^3 = -2672.19977                          
Sum X^4 = 1817971236.89010     Sum x^4 = 348220.26132                         
                                                                              
Missing value = -1.00000                                                      


(3) Variable Name = GNP                                                       
Mean      = 387.69844     Std Dev     = 99.39494      N = 16                  
SE Mean   = 24.84873      Biased S.D. = 96.23873                              
Skew      = 0.02528       Kurtosis    = -1.11807                              
Minimum   = 234.28900     Maximum     = 554.89400                             
Range     = 320.60500     Coeff of var = 411.58787                            
Geometric mean = 375.26336  Harmonic mean  = 362.58960                        
                                                                              
Sum X   = 6203.17500                                                          
Sum X^2 = 2553151.55993        Sum x^2 = 148190.30489                         
Sum X^3 = 1105119772.78482     Sum x^3 = 360602.98270                         
Sum X^4 = 498279105976.54724   Sum x^4 = 2582992122.52588                     
                                                                              
Missing value = -1.00000                                                      
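
  The printed sums are enough to reproduce most of the other
statistics.  The Python sketch below does so for the Employed
listing, assuming the conventional moment formulas: Std Dev uses
N - 1, Biased S.D. uses N, Skew is m3 / m2^1.5, and Kurtosis is
m4 / m2^2 - 3, where mk = Sum x^k / N.  The manual does not state
the formulas used by the SPPC; these are assumptions that agree
with the printed values to within rounding.  The function name
stats_from_sums is hypothetical.

```python
import math


def stats_from_sums(n, sum_x, sum_dev2, sum_dev3, sum_dev4):
    """Derive summary statistics from N, Sum X and the deviation sums."""
    m2 = sum_dev2 / n                        # biased variance
    std_dev = math.sqrt(sum_dev2 / (n - 1))  # unbiased (N - 1) form
    return {
        "Mean": sum_x / n,
        "Std Dev": std_dev,
        "SE Mean": std_dev / math.sqrt(n),
        "Biased S.D.": math.sqrt(m2),
        "Skew": (sum_dev3 / n) / m2 ** 1.5,
        "Kurtosis": (sum_dev4 / n) / m2 ** 2 - 3.0,
    }


# Inputs copied from the Employed listing above.
employed = stats_from_sums(16, 1045.07200, 185.00883, -59.33140, 3526.62700)
for name, value in employed.items():
    print(f"{name:12s} = {value:.5f}")
```

  The Coeff of var line is left out of the sketch.  In all three
listings the printed value equals Sum X^2 / Sum X to the
displayed precision (for Employed, 68445.97665 / 1045.07200 =
65.49403) rather than the conventional 100 * Std Dev / Mean.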


