CONDUCTING TWO-WAY ANALYSIS OF VARIANCE

  This section briefly  introduces  a  simple  method  for  using
multiple  regression as a means of conducting two-way analysis of 
variance.  If you are not familiar with the use of regression to
conduct ANOVA, you should review the instructional materials for
'One-Way Analysis of Variance' before proceeding.

  The  conduct  of  two-way  ANOVA is largely an extension of the 
technique described for one-way ANOVA.  The essential  difference
is  that  you  now have one continuous dependent variable and two 
classification  variables.   You  therefore   must   create   new
variables  for  each  of  the  subclasses  within each of the two
classification variables.  This will be illustrated with a simple
example.

  Suppose you have a set of depression scores, as before, but you 
now have two classification variables.  One  represents  'Marital
Status' and the other represents 'Social Class' as follows.


    Depression           Marital            Social
      Score              Status             Class

       23                Single             Lower
       22                Single             Middle
       24                Single             Lower
       20                Single             Middle
       26                Single             Upper
       11                Married            Lower
       14                Married            Middle
       15                Married            Upper
       38                Divorced           Lower
       34                Divorced           Middle
       37                Divorced           Upper
       11                Other              Lower
       12                Other              Middle
       09                Other              Middle
       13                Other              Upper


  As  before,  you must create one new variable for each subclass 
of each of the two classification variables.   That  is  done  as
shown below.

   MS1 = Single         SC1 = Lower
   MS2 = Married        SC2 = Middle
   MS3 = Divorced       SC3 = Upper
   MS4 = Other 

  The next step is to code each person as 1 if they are a member
of the subclass and as 0 if they are NOT.  When that is done for
each person in the sample, we obtain the data shown below:


    Depression   Marital Status Variables     Social Class
      Score      MS1    MS2    MS3    MS4    SC1   SC2   SC3

       23         1      0      0      0      1     0     0
       22         1      0      0      0      0     1     0
       24         1      0      0      0      1     0     0
       20         1      0      0      0      0     1     0
       26         1      0      0      0      0     0     1
       11         0      1      0      0      1     0     0
       14         0      1      0      0      0     1     0
       15         0      1      0      0      0     0     1
       38         0      0      1      0      1     0     0
       34         0      0      1      0      0     1     0
       37         0      0      1      0      0     0     1
       11         0      0      0      1      1     0     0
       12         0      0      0      1      0     1     0
       09         0      0      0      1      0     1     0
       13         0      0      0      1      0     0     1
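
  The coding rule above can be sketched in Python as follows.
This is only an illustration and is not part of the program; the
names 'cases', 'indicators' and 'coded' are hypothetical, and the
data are copied from the example.

```python
# Raw data from the example: (depression score, marital status, social class).
cases = [
    (23, "Single", "Lower"), (22, "Single", "Middle"), (24, "Single", "Lower"),
    (20, "Single", "Middle"), (26, "Single", "Upper"),
    (11, "Married", "Lower"), (14, "Married", "Middle"), (15, "Married", "Upper"),
    (38, "Divorced", "Lower"), (34, "Divorced", "Middle"), (37, "Divorced", "Upper"),
    (11, "Other", "Lower"), (12, "Other", "Middle"), (9, "Other", "Middle"),
    (13, "Other", "Upper"),
]

MARITAL = ["Single", "Married", "Divorced", "Other"]   # MS1..MS4
SOCIAL = ["Lower", "Middle", "Upper"]                  # SC1..SC3


def indicators(value, levels):
    """Return one 0/1 code per subclass: 1 for the matching subclass, else 0."""
    return [1 if value == level else 0 for level in levels]


# One entry per case: (score, [MS1..MS4], [SC1..SC3]).
coded = [(score, indicators(ms, MARITAL), indicators(sc, SOCIAL))
         for score, ms, sc in cases]

for score, ms_codes, sc_codes in coded:
    print(f"{score:>4}  {ms_codes}  {sc_codes}")
```

  Each case receives exactly one 1 among MS1 to MS4 and exactly
one 1 among SC1 to SC3, which is what the table above shows.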
 

  Now that we have created the basic data for the two
classification variables, we must once again delete one of the
subclass variables.  This time, however, that must be done for
each of the two classification variables.  As before, it does not
matter which of the subclass variables is deleted within each of
the two classification variables.

  For purposes of illustration, we shall delete the last variable 
in each of the classification  variables,  and  the  results  are
shown below.


    Depression
      Score    MS1  MS2  MS3  SC1 SC2

       23       1    0    0    1   0
       22       1    0    0    0   1
       24       1    0    0    1   0
       20       1    0    0    0   1
       26       1    0    0    0   0
       11       0    1    0    1   0
       14       0    1    0    0   1
       15       0    1    0    0   0
       38       0    0    1    1   0
       34       0    0    1    0   1
       37       0    0    1    0   0
       11       0    0    0    1   0
       12       0    0    0    0   1
       09       0    0    0    0   1
       13       0    0    0    0   0
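
  The deletion step can be sketched in Python as follows.  The
helper name 'drop_last' is hypothetical and the codes are typed in
from the table above.  The sketch also checks that nothing is
lost: within each classification the full set of codes sums to 1
for every case, so the deleted variable can always be rebuilt from
the retained ones.

```python
# Full 0/1 codes for each case, in the same order as the tables above.
# MS columns are MS1..MS4; SC columns are SC1..SC3.
ms_full = ([[1, 0, 0, 0]] * 5 + [[0, 1, 0, 0]] * 3
           + [[0, 0, 1, 0]] * 3 + [[0, 0, 0, 1]] * 4)
sc_codes = {"L": [1, 0, 0], "M": [0, 1, 0], "U": [0, 0, 1]}
sc_full = [sc_codes[c] for c in "LMLMU" "LMU" "LMU" "LMMU"]


def drop_last(rows):
    """Delete the last subclass variable from every case."""
    return [row[:-1] for row in rows]


ms_kept = drop_last(ms_full)   # MS1, MS2, MS3
sc_kept = drop_last(sc_full)   # SC1, SC2

# The deleted variable is 1 exactly when all retained codes are 0.
ms4_rebuilt = [1 - sum(row) for row in ms_kept]
sc3_rebuilt = [1 - sum(row) for row in sc_kept]

print(ms4_rebuilt == [row[3] for row in ms_full])
print(sc3_rebuilt == [row[2] for row in sc_full])
```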
 

  Now that we have constructed a suitable set of subclass
variables for each of the two classification variables, we could
analyze the data as shown above.  Unfortunately, MS1 through MS3
represent only the 'main effect' of marital status, and SC1 and
SC2 represent only the 'main effect' of social class.

  We have not accounted for the possibility of an interaction
between marital status and social class.  Such an interaction
must be described with additional variables.  Fortunately, they
are easy to construct.

  In order to construct the variables  needed  to  represent  the
interaction  between  two classification variables, you need only
follow a simple rule.  Multiply each of the retained variables in 
the first 'factor' (marital  status)  by  each  of  the  retained
variables  in  the  second 'factor' (social class) to produce, in
this example,

   M1xS1 = MS1 * SC1
   M2xS1 = MS2 * SC1
   M3xS1 = MS3 * SC1

   M1xS2 = MS1 * SC2
   M2xS2 = MS2 * SC2
   M3xS2 = MS3 * SC2 

  If that is done for each case in the sample, we obtain the data
shown below:


Score    MS1 MS2 MS3 SC1 SC2 M1xS1 M2xS1 M3xS1 M1xS2 M2xS2 M3xS2

 23       1   0   0   1   0    1     0     0     0     0     0
 22       1   0   0   0   1    0     0     0     1     0     0
 24       1   0   0   1   0    1     0     0     0     0     0
 20       1   0   0   0   1    0     0     0     1     0     0
 26       1   0   0   0   0    0     0     0     0     0     0
 11       0   1   0   1   0    0     1     0     0     0     0
 14       0   1   0   0   1    0     0     0     0     1     0
 15       0   1   0   0   0    0     0     0     0     0     0
 38       0   0   1   1   0    0     0     1     0     0     0
 34       0   0   1   0   1    0     0     0     0     0     1
 37       0   0   1   0   0    0     0     0     0     0     0
 11       0   0   0   1   0    0     0     0     0     0     0
 12       0   0   0   0   1    0     0     0     0     0     0
 09       0   0   0   0   1    0     0     0     0     0     0
 13       0   0   0   0   0    0     0     0     0     0     0 
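
  The multiplication rule can be sketched in Python as follows.
The function name 'interaction_codes' is hypothetical, and the
retained codes are typed in from the table above.

```python
# Retained codes for each case, in the order of the tables above:
# (MS1, MS2, MS3) and (SC1, SC2).
ms = [(1, 0, 0)] * 5 + [(0, 1, 0)] * 3 + [(0, 0, 1)] * 3 + [(0, 0, 0)] * 4
sc_codes = {"L": (1, 0), "M": (0, 1), "U": (0, 0)}
sc = [sc_codes[c] for c in "LMLMU" "LMU" "LMU" "LMMU"]

# Column names in the order used above: M1xS1, M2xS1, M3xS1, M1xS2, ...
names = [f"M{i + 1}xS{j + 1}" for j in range(2) for i in range(3)]


def interaction_codes(m, s):
    """Multiply every retained MS code by every retained SC code."""
    return [m[i] * s[j] for j in range(2) for i in range(3)]


interactions = [interaction_codes(m, s) for m, s in zip(ms, sc)]

print(" ".join(names))
for row in interactions:
    print(" ".join(f"{value:>5}" for value in row))
```

  A product is 1 only when both of its parent codes are 1, so any
case belonging to an omitted subclass ('Other' or 'Upper') has 0
on all six interaction variables.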
 

  If you wish to analyze these data, choose the raw  data  option
and  then  use  the file on the SPPC Disk 1 diskette that has the 
name TWOWAYR.DAT.  You can also choose the  summary  data  option
and use the file named TWOWAYS.DAT. 

  Partial results for this analysis are shown below along with  a
brief interpretation of them.


              SIMULTANEOUS MULTIPLE REGRESSION RESULTS

                Raw Score b-  Standardized
Variable Name   Coefficients        Beta      t-ratio      p <=
--------------  ------------  ------------   ----------  --------
     Intercept      13.00000
           MS1      13.00000       0.65189      6.01783    0.0078
           MS2       2.00000       0.08510      0.92582    0.5749
           MS3      24.00000       1.02120     11.10980    0.0012
           SC1      -2.00000      -0.10029     -0.92582    0.5749
           SC2      -2.50000      -0.13028     -1.33631    0.2738
         M1xS1      -0.50000      -0.01808     -0.17496    0.8656
         M2xS1      -2.00000      -0.05306     -0.65465    0.5613
         M3xS1       3.00000       0.07960      0.98198    0.5996
         M1xS2      -2.50000      -0.09040     -0.94491    0.5834
         M2xS2       1.50000       0.03980      0.52489    0.6363
         M3xS2      -0.50000      -0.01326     -0.17496    0.8656
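
  The raw-score b-coefficients can be checked independently of
the program.  The following Python sketch solves the least-squares
normal equations with exact fractions from the standard library.
It is an illustration under stated assumptions, not the method
used by the program; the names 'design_row' and 'least_squares'
are hypothetical, and a numerical library would normally be used
instead of hand-written elimination.

```python
from fractions import Fraction

scores = [23, 22, 24, 20, 26, 11, 14, 15, 38, 34, 37, 11, 12, 9, 13]
ms_level = [0] * 5 + [1] * 3 + [2] * 3 + [3] * 4           # 3 = Other (omitted)
sc_level = [0, 1, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 1, 2]   # 2 = Upper (omitted)

names = ["Intercept", "MS1", "MS2", "MS3", "SC1", "SC2",
         "M1xS1", "M2xS1", "M3xS1", "M1xS2", "M2xS2", "M3xS2"]


def design_row(m, s):
    """Intercept, retained MS and SC codes, and their six products."""
    ms = [int(m == i) for i in range(3)]
    sc = [int(s == j) for j in range(2)]
    products = [ms[i] * sc[j] for j in range(2) for i in range(3)]
    return [1] + ms + sc + products


def least_squares(X, y):
    """Solve (X'X) b = X'y exactly by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[Fraction(sum(r[i] * r[j] for r in X)) for j in range(k)]
         + [Fraction(sum(r[i] * yi for r, yi in zip(X, y)))]
         for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [row[-1] for row in A]


X = [design_row(m, s) for m, s in zip(ms_level, sc_level)]
b = least_squares(X, scores)

for name, coef in zip(names, b):
    print(f"{name:>10} {float(coef):10.5f}")
```

  Under these assumptions the printed values agree with the
raw-score b-coefficients in the table above.  The sketch does not
compute the standardized betas, t-ratios or probabilities.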


  The raw-score b-coefficients are interpreted in much the same
manner as in the one-way ANOVA, but there are some differences.
Because the 'Other' marital status subclass variable (MS4) was
eliminated, the b-coefficients for 'Marital Status' are defined as
contrasts with the mean depression score for MS4.  That is,

   b1 = MS1 - MS4
   b2 = MS2 - MS4, and
   b3 = MS3 - MS4

  The  coefficients  for  'Social Class' are defined as contrasts 
with the mean score for SC3 since that was the subclass  variable
omitted from the model.  Thus,

   b4 = SC1 - SC3, and
   b5 = SC2 - SC3


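  If you compute the subclass mean depression scores for the two
ways of classifying the individuals in this sample, you will
discover that the above b-coefficients, or contrasts, are not
equal to the differences between the subclass means, as they were
in the one-way ANOVA example.  Because the model includes the
interaction variables, each b-coefficient is a difference between
cell means taken at the omitted subclass of the other factor.  For
example, b1 = 13 is the mean for Single, Upper (26) minus the mean
for Other, Upper (13), whereas the Single and Other subclass means
are 23 and 11.25.  In addition, each cell in the design does not
have the same number of cases, so the subclass means weight the
cells unequally.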
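
  The comparison between the contrasts and the subclass means can
be checked with a short Python sketch.  The function name
'subclass_mean' is hypothetical and the data are copied from the
example.

```python
from statistics import mean

# (depression score, marital status, social class) from the example.
cases = [
    (23, "Single", "Lower"), (22, "Single", "Middle"), (24, "Single", "Lower"),
    (20, "Single", "Middle"), (26, "Single", "Upper"),
    (11, "Married", "Lower"), (14, "Married", "Middle"), (15, "Married", "Upper"),
    (38, "Divorced", "Lower"), (34, "Divorced", "Middle"), (37, "Divorced", "Upper"),
    (11, "Other", "Lower"), (12, "Other", "Middle"), (9, "Other", "Middle"),
    (13, "Other", "Upper"),
]


def subclass_mean(marital=None, social=None):
    """Mean score over the cases matching the given subclass labels."""
    return mean(score for score, m, s in cases
                if marital in (None, m) and social in (None, s))


# Difference between the Single and Other subclass means.
marginal_diff = subclass_mean(marital="Single") - subclass_mean(marital="Other")

# Difference between the Single and Other cell means within 'Upper',
# the social class subclass that was omitted from the model.
cell_diff = subclass_mean("Single", "Upper") - subclass_mean("Other", "Upper")

print("Single - Other, all cases :", marginal_diff)
print("Single - Other, Upper only:", cell_diff)
```

  The second difference is the one that agrees with the
coefficient reported for MS1; the first, taken over all cases, is
not reproduced by any single coefficient.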

  This may seem to be a disadvantage.   Actually,  what  is  seen
here  is  an  approach  to  factorial  analysis  of variance that 
provides a simple and powerful  means  of  dealing  with  designs
having  unequal  and  disproportionate  numbers  of  cases in the 
subclasses.  For an important and extensive  discussion  of  this
topic see Chapter 5 of the Cohen & Cohen text.  

  The ANOVA table for this example is presented below.  It is
important to note the hierarchical step-down F-ratios and their
associated probabilities.  Each sum of squares is the increase in
the regression sum of squares obtained when that variable is
added to the variables listed above it.  The model must be
interpreted in a hierarchical manner whenever one has unequal and
disproportionate subclass sample sizes.

  Again, it is urged that the unfamiliar user consult discussions
such as those found in Chapter 5 of the Cohen & Cohen text.


                       ANALYSIS OF VARIANCE

                                             Hierarchical
 VARIANCE             SUM OF         MEAN      Step-down
  SOURCE        df   SQUARES       SQUARES      F-ratio     p <=
-------------  ---- ------------  ----------  -----------  ------
          MS1    1     43.20000     43.20000     18.51430  0.0212
          MS2    1    157.73300    157.73300     67.60000  0.0030
          MS3    1   1078.58000   1078.58000    462.25000  0.0002
For this set:    3   1279.52000    426.50600    182.78800  0.0008

          SC1    1      0.00211      0.00211      0.00090  0.9765
          SC2    1     21.50360     21.50360      9.21584  0.0545
For this set:    2     21.50570     10.75290      4.60837  0.1221

        M1xS1    1      0.69122      0.69122      0.29623  0.6248
        M2xS1    1      8.14787      8.14787      3.49194  0.1580
        M3xS1    1      3.90516      3.90516      1.67364  0.2865
        M1xS2    1      3.73333      3.73333      1.60000  0.2955
        M2xS2    1      1.02857      1.02857      0.44081  0.5562
        M3xS2    1      0.07142      0.07142      0.03061  0.8656
For this set:    6     17.57760      2.92960      1.25554  0.4594

        Error    3      7.00000      2.33333
        Total   14   1325.60000
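
  The 'For this set' lines of the table can be checked with the
following Python sketch, which fits four nested models (intercept
only, then adding the marital status set, the social class set,
and the interaction set) and takes the drop in residual sum of
squares at each step.  It is a sketch under stated assumptions,
not the program's own algorithm; the names 'design_row' and
'residual_ss' are hypothetical, and the probabilities are not
computed.

```python
from fractions import Fraction

scores = [23, 22, 24, 20, 26, 11, 14, 15, 38, 34, 37, 11, 12, 9, 13]
ms_level = [0] * 5 + [1] * 3 + [2] * 3 + [3] * 4           # 3 = Other (omitted)
sc_level = [0, 1, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 1, 2]   # 2 = Upper (omitted)


def design_row(m, s, n_sets):
    """Intercept plus the first n_sets of: MS codes, SC codes, products."""
    ms = [int(m == i) for i in range(3)]
    sc = [int(s == j) for j in range(2)]
    products = [ms[i] * sc[j] for j in range(2) for i in range(3)]
    row = [1]
    for block in (ms, sc, products)[:n_sets]:
        row += block
    return row


def residual_ss(X, y):
    """Residual sum of squares from an exact least-squares fit."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan.
    A = [[Fraction(sum(r[i] * r[j] for r in X)) for j in range(k)]
         + [Fraction(sum(r[i] * yi for r, yi in zip(X, y)))]
         for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    b = [row[-1] for row in A]
    return sum((yi - sum(c * x for c, x in zip(b, r))) ** 2
               for r, yi in zip(X, y))


# Residual SS after 0, 1, 2 and 3 sets of variables have been entered.
rss = [residual_ss([design_row(m, s, n) for m, s in zip(ms_level, sc_level)],
                   scores) for n in range(4)]

set_names = ["Marital Status", "Social Class", "Interaction"]
set_df = [3, 2, 6]
error_df = len(scores) - 12            # 12 coefficients in the full model
error_ms = rss[3] / error_df

ss = {name: rss[n] - rss[n + 1] for n, name in enumerate(set_names)}
f_ratio = {name: (ss[name] / df) / error_ms
           for name, df in zip(set_names, set_df)}

for name, df in zip(set_names, set_df):
    print(f"{name:>15} {df:>3} {float(ss[name]):12.5f} {float(f_ratio[name]):10.5f}")
print(f"{'Error':>15} {error_df:>3} {float(rss[3]):12.5f}")
print(f"{'Total':>15} {len(scores) - 1:>3} {float(rss[0]):12.5f}")
```

  Because each set is entered after the sets above it, changing
the order of entry would generally change the sums of squares
when the subclass sample sizes are unequal and disproportionate.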


  Space  does  not  allow  a  more  extended  discussion  of  the 
interpretation of hierarchical analysis.   However,  any  two-way
crossed  factorial  ANOVA  can be constructed by use of the above 
procedures.  It does not matter how many subclasses  are  present
within each of the two classification variables for such designs.

  The present limits of the program are such that you may have no
more  than  50 retained subclass and interaction variables in any
design or regression analysis.
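
  For a two-way design, the number of retained variables follows
directly from the construction described above, and can be
checked against the limit before the data are prepared.  The
following Python sketch is an illustration only; the function
names are hypothetical, and the limit of 50 is the one stated
above.

```python
LIMIT = 50   # maximum retained subclass and interaction variables


def retained_variables(a, b):
    """Variables needed when the two classifications have a and b subclasses."""
    main_effects = (a - 1) + (b - 1)      # one variable deleted per factor
    interactions = (a - 1) * (b - 1)      # every retained pair multiplied
    return main_effects + interactions


def fits_program(a, b):
    """True if an a x b design stays within the stated limit."""
    return retained_variables(a, b) <= LIMIT


print(retained_variables(4, 3))   # the marital status by social class example
print(fits_program(7, 7), fits_program(6, 9))
```

  The total simplifies to (a * b) - 1, so the example with 4 and
3 subclasses needs 11 variables, as in the regression table above.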
