May 22, 2012

Conduct and Interpret a Sequential One-Way Discriminant Analysis

What is the Sequential One-Way Discriminant Analysis?

Sequential one-way discriminant analysis is similar to one-way discriminant analysis. Discriminant analysis predicts group membership by fitting a line through the scatter plot of the independent variables. With more than two independent variables it fits a plane through the scatter cloud, thus separating the observations into two groups: one group to the “left” of the line and one group to the “right” of the line.

Sequential one-way discriminant analysis additionally assumes that the discriminating independent variables are not equally important. The ranking might reflect the suspected explanatory power of the variables, a hypothesis deduced from theory, or a practical assumption, for example in customer segmentation studies.

Like the standard one-way discriminant analysis, sequential one-way discriminant analysis is useful mainly for two purposes: 1) identifying differences between groups, and 2) predicting group membership.

Firstly, sequential one-way discriminant analysis identifies the independent variables that significantly discriminate between the groups that are defined by the dependent variable. Typically, sequential one-way discriminant analysis is conducted after a cluster analysis or a decision tree analysis to assess the goodness of fit of the resulting grouping (remember that cluster analysis does not include any goodness of fit measures itself). Sequential one-way discriminant analysis tests whether each of the independent variables has discriminating power between the groups.

Secondly, sequential one-way discriminant analysis can be used to predict group membership. One output of the sequential one-way discriminant analysis is Fisher’s discriminant coefficients, which define one linear equation per group. Fisher originally developed this approach to identify the species to which a plant belongs. He argued that instead of going through a whole classification table, only a subset of characteristics is needed. If you plug the scores of respondents into these linear equations, the result predicts the group membership. This is typically used in customer segmentation, credit risk scoring, or identifying diagnostic groups.

Because sequential one-way discriminant analysis assumes that group membership is given and that the variables are split into independent and dependent variables, it is a so-called structure testing method, as opposed to structure exploration methods (e.g., factor analysis, cluster analysis).

The sequential one-way discriminant analysis assumes that the dependent variable represents group membership, so this variable should be nominal. The independent variables represent the characteristics explaining group membership.

The independent variables need to be continuous-level (interval or ratio scale). Thus the sequential one-way discriminant analysis is similar to MANOVA and to logistic, multinomial, and ordinal regression. Sequential one-way discriminant analysis differs from MANOVA because it works the other way around: MANOVA tests for differences in the mean scores of continuous-level (interval or ratio) dependent variables, and the groups are defined by the independent variable.

Sequential one-way discriminant analysis is different from logistic, ordinal, and multinomial regression because it uses ordinary least squares instead of maximum likelihood; sequential one-way discriminant analysis, therefore, requires smaller samples. Also, continuous variables can only be entered as covariates in the regression models; the independent variables are assumed to be ordinal in scale. Reducing the scale level of an interval or ratio variable to ordinal in order to conduct multinomial regression takes variation out of the data and reduces the statistical power of the test. Whereas sequential one-way discriminant analysis assumes continuous variables, logistic, multinomial, and ordinal regression assume categorical data and thus use a Chi-Square-like matrix structure. The disadvantage of this is that extremely large sample sizes are needed for designs with many factors or factor levels.

Moreover, sequential one-way discriminant analysis is a better predictor of group membership if the assumptions of multivariate normality, homoscedasticity, and independence are met. Because the variables are entered sequentially, we can also prevent over-fitting of the model; that is, we can restrict the model to the relevant independent variables and focus subsequent analyses. Also, because it is an analysis of covariance, we can measure the discriminating power of a predictor variable after removing the effects of the other independent predictors.

The Sequential One-Way Discriminant Analysis in SPSS

The research question for the sequential one-way discriminant analysis is as follows:

The students in our sample were taught with different methods and their ability in different tasks was repeatedly graded on aptitude tests and exams. At the end of the study the pupils got to choose from three computer game ‘thank you’ gifts: an action game (Superblaster), a puzzle game (Puzzle Mania) and a sports game (Polar Bear Olympics). The researchers wish to learn what guided the pupils’ choice of gift.

The independent variables are the three test scores from the standardized mathematics, reading, and writing tests (viz. Test_Score, Test2_Score, and Test3_Score). From a previous correlation analysis we suspect that the writing and the reading scores have the highest influence on the outcome. In our logistic regression we found that pupils scoring lower had higher risk ratios of preferring the action game over the sports or the puzzle game.

The sequential one-way discriminant analysis is not part of the graphical user interface of SPSS. However, if we want to include our variables in the sequential one-way discriminant model in a specific order, we can do so by specifying the order in the /ANALYSIS subcommand of the DISCRIMINANT syntax.

The SPSS syntax for a sequential one-way discriminant analysis specifies the sequence in which the variables are included in the analysis by defining an inclusion level for each variable. SPSS accepts inclusion levels from 0 to 99; variables with higher levels are considered for entry first, and variables with level 0 are never included in the analysis.

DISCRIMINANT
  /GROUPS=Gift(1 3)
  /VARIABLES=Test_Score Test2_Score Test3_Score
  /ANALYSIS Test3_Score (3), Test2_Score (2), Test_Score (1)
  /METHOD=WILKS
  /FIN=3.84
  /FOUT=2.71
  /PRIORS SIZE
  /HISTORY
  /STATISTICS=BOXM COEFF
  /CLASSIFY=NONMISSING POOLED.

The Output of the Sequential One-Way Discriminant Analysis

The first few tables in the output of the sequential one-way discriminant analysis describe the model design and the sample size. The first relevant table is Box’s M test, which tests the null hypothesis that the covariance matrices of the independent variables are equal across the groups defined by the dependent variable. We find that Box’s M is significant (p < .001); therefore we cannot assume equality of covariances. However, discriminant analysis is fairly robust against the violation of this assumption.

Test Results
Box’s M 34.739
F Approx. 5.627
df1 6
df2 205820.708
Sig. .000
Tests null hypothesis of equal population covariance matrices.

The next table shows the variables entered in each step of the sequential one-way discriminant analysis.

Variables Entered/Removed (a, b, c, d)
                     Wilks’ Lambda                       Exact F
Step  Entered        Statistic  df1  df2  df3      Statistic  df1  df2      Sig.
1     Writing Test   .348       1    2    104.000  97.457     2    104.000  .000
2     Reading Test   .150       2    2    104.000  81.293     4    206.000  .000
At each step, the variable that minimizes the overall Wilks’ Lambda is entered.
a. Maximum number of steps is 4.
b. Minimum partial F to enter is 3.84.
c. Maximum partial F to remove is 2.71.
d. F level, tolerance, or VIN insufficient for further computation.

We find that the writing test score was entered first, followed by the reading test score (based on Wilks’ Lambda). The third variable we specified, the math test score, was not entered because it did not explain any more variance in the data. The table also shows the significance of the model at each step, based on the F-test.
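
The F statistics in this table can be reproduced approximately from the reported Wilks’ Lambda values. The following Python sketch is not SPSS output. It assumes the textbook exact-F transformations of Wilks’ Lambda for three groups, and it takes N = 107 from the reported degrees of freedom (104 = N - 3). The function names are illustrative only. Because Lambda is reported to three decimals, the results agree with the table only to within rounding.

```python
import math

N_CASES = 107   # assumption: inferred from the reported df of 104 = N - 3 groups
N_GROUPS = 3

def exact_f_one_variable(wilks_lambda, n=N_CASES, g=N_GROUPS):
    """Exact F for a single discriminating variable (the one-way ANOVA F)."""
    return (n - g) / (g - 1) * (1 - wilks_lambda) / wilks_lambda

def exact_f_three_groups(wilks_lambda, p, n=N_CASES):
    """Exact F for p variables in the model when there are exactly three groups.

    The degrees of freedom are 2p and 2(n - 3 - p + 1).
    """
    root = math.sqrt(wilks_lambda)
    error_df = n - 3
    return (1 - root) / root * (error_df - p + 1) / p

def partial_f_to_enter(lambda_before, lambda_after, p_before,
                       n=N_CASES, g=N_GROUPS):
    """Partial F for adding one variable to a model holding p_before variables."""
    ratio = lambda_after / lambda_before
    return (n - g - p_before) / (g - 1) * (1 - ratio) / ratio

step1_f = exact_f_one_variable(0.348)                 # writing test only
step2_f = exact_f_three_groups(0.150, p=2)            # writing and reading
reading_partial_f = partial_f_to_enter(0.348, 0.150, p_before=1)

print(round(step1_f, 2), round(step2_f, 2), round(reading_partial_f, 2))
```

Under these assumptions the two exact F values come out close to the reported 97.457 and 81.293, and the partial F for adding the reading test is far above the entry threshold of 3.84, which is consistent with the reading test being entered in step 2.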

Eigenvalues
Function  Eigenvalue  % of Variance  Cumulative %  Canonical Correlation
1         5.601 (a)   99.9           99.9          .921
2         .007 (a)    .1             100.0         .085
a. First 2 canonical discriminant functions were used in the analysis.

The next few tables show the variables in the analysis, the variables not in the analysis, and Wilks’ Lambda. All of these tables contain virtually the same data. The Eigenvalues table shows the discriminant eigenvalues. Each eigenvalue is defined as the ratio of the between-groups to the within-groups sum of squares of the discriminant scores,

Eigenvalue = SS between groups / SS within groups

and the discriminant functions are chosen to maximize this ratio. We find that the first function explains 99.9% of the variance and the second function explains the rest. This is quite unusual for a discriminant model. This table also shows the canonical correlation coefficient for the sequential discriminant analysis, which is defined as

Canonical correlation = sqrt(Eigenvalue / (1 + Eigenvalue))
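
As a quick cross-check of the Eigenvalues table, the short Python sketch below recomputes the share of variance and the canonical correlation from the two reported eigenvalues. It assumes the standard relation, canonical correlation = sqrt(eigenvalue / (1 + eigenvalue)), and uses the eigenvalues as printed, so small deviations from the table are due to rounding (most visibly for the tiny second eigenvalue). The function names are illustrative.

```python
import math

# Eigenvalues as reported in the SPSS output (rounded to three decimals)
eigenvalues = [5.601, 0.007]

def canonical_correlation(eigenvalue):
    """Canonical correlation implied by a discriminant eigenvalue."""
    return math.sqrt(eigenvalue / (1 + eigenvalue))

def percent_of_variance(values):
    """Share of the total discriminating power carried by each function."""
    total = sum(values)
    return [100 * value / total for value in values]

correlations = [canonical_correlation(value) for value in eigenvalues]
shares = percent_of_variance(eigenvalues)

for number, (corr, share) in enumerate(zip(correlations, shares), start=1):
    print(f"Function {number}: canonical r = {corr:.3f}, share = {share:.1f}%")
```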

The next table in the output of our sequential one-way discriminant analysis shows the standardized canonical discriminant function coefficients; these are the estimated Beta coefficients. Since we have three groups and two variables in the model, we obtain two functions (the number of functions is the smaller of the number of groups minus one and the number of predictors). We see that

Y1 = .709 * Writing Test + .827 * Reading Test

Y2 = .723 * Writing Test – .585 * Reading Test

Standardized Canonical Discriminant Function Coefficients
               Function
               1      2
Writing Test   .709   .723
Reading Test   .827   -.585

This, however, has no inherent meaning other than telling us that a high score on both tests gives function 1 a high value, while simultaneously giving function 2 a lower value. To interpret this table, we need to look at the group centroids of our one-way sequential discriminant analysis at the same time.
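
To make the effect of the standardized coefficients concrete, here is a minimal Python sketch that evaluates both functions for hypothetical standardized test scores. The z-scores used are invented for illustration and do not come from the study, and the function name is an assumption. Standardized coefficients apply to standardized predictors, so the results only indicate relative size.

```python
# Standardized canonical discriminant function coefficients from the table
STANDARDIZED_COEFFS = {
    1: {"writing": 0.709, "reading": 0.827},
    2: {"writing": 0.723, "reading": -0.585},
}

def function_scores(z_writing, z_reading):
    """Evaluate both discriminant functions for standardized test scores."""
    z = {"writing": z_writing, "reading": z_reading}
    return {
        number: sum(weight * z[test] for test, weight in coeffs.items())
        for number, coeffs in STANDARDIZED_COEFFS.items()
    }

# Hypothetical pupil one standard deviation above the mean on both tests
high_both = function_scores(1.0, 1.0)
# Hypothetical pupil strong in writing but weak in reading
writing_only = function_scores(1.0, -1.0)

print(high_both)
print(writing_only)
```

For the pupil who is high on both tests, function 1 is clearly larger than function 2, because the writing and reading coefficients add up in function 1 but largely cancel in function 2.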

Functions at Group Centroids
                       Function
Gift chosen by pupil   1        2
Superblaster           -2.506   -.060
Puzzle Mania           -.276    .131
Polar Bear Olympics    3.023    -.045
Unstandardized canonical discriminant functions evaluated at group means

We find that a high score of about three on the first function (centroid 3.023) indicates a preference for the sports game, a score close to zero (centroid -.276) indicates a preference for the puzzle game, and a low score (centroid -2.506) indicates a preference for the action game. Remember that this first function explained 99.9% of the variance in our data. We also know that function 1 scores higher for high results in the writing and the reading tests, with reading being a bit more important than writing.
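
One way to read the centroid table is to assign a pupil to the group whose centroid lies closest to the pupil's discriminant scores. The Python sketch below illustrates this idea only. It is a simplification that ignores the prior probabilities used by SPSS, the function name is an assumption, and the example scores are hypothetical.

```python
import math

# Group centroids (function 1, function 2) from the SPSS output
CENTROIDS = {
    "Superblaster": (-2.506, -0.060),
    "Puzzle Mania": (-0.276, 0.131),
    "Polar Bear Olympics": (3.023, -0.045),
}

def nearest_centroid(score1, score2):
    """Return the gift whose centroid is closest in Euclidean distance."""
    def distance(gift):
        c1, c2 = CENTROIDS[gift]
        return math.hypot(score1 - c1, score2 - c2)
    return min(CENTROIDS, key=distance)

# Hypothetical discriminant scores, not taken from the data set
print(nearest_centroid(2.5, 0.0))
print(nearest_centroid(-0.5, 0.1))
print(nearest_centroid(-3.0, 0.0))
```

Because the centroids differ almost entirely on function 1, the assignment is driven by the first score, which matches the finding that function 1 carries nearly all of the discriminating power.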

Classification Function Coefficients
               Gift chosen by pupil
               Superblaster   Puzzle Mania   Polar Bear Olympics
Writing Test   .151           .403           .727
Reading Test   .206           .464           .885
(Constant)     -2.249         -8.521         -26.402
Fisher’s linear discriminant functions

Thus we can say that pupils who did well on our reading and writing test are more likely to choose the sports game, and pupils who did not do well on the tests are more likely to choose the action game.

The final interesting table in the sequential one-way discriminant analysis output is the classification function coefficient table. Fisher’s classification coefficients can be used to predict group membership.

In our case we get three functions:

Superblaster = -2.249 + .151 * writing + .206 * reading
Puzzle Mania = -8.521 + .403 * writing + .464 * reading
Polar Bear Olympics = -26.402 + .727 * writing + .885 * reading

If we plug in the numbers of a new student joining the class who scores 40 on both tests, we get three scores:

Superblaster = 12.031
Puzzle Mania = 26.159
Polar Bear Olympics = 38.078

Thus the student would most likely choose Polar Bear Olympics (the highest value predicts the group membership).
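
The same calculation can be sketched in a few lines of Python. The coefficients are those from the classification function table; the function names are illustrative, and the second example pupil (scoring 10 on both tests) is hypothetical.

```python
# Fisher's classification function coefficients: (constant, writing, reading)
CLASSIFICATION_FUNCTIONS = {
    "Superblaster": (-2.249, 0.151, 0.206),
    "Puzzle Mania": (-8.521, 0.403, 0.464),
    "Polar Bear Olympics": (-26.402, 0.727, 0.885),
}

def classification_scores(writing, reading):
    """Evaluate each group's classification function for one pupil."""
    return {
        gift: constant + b_writing * writing + b_reading * reading
        for gift, (constant, b_writing, b_reading)
        in CLASSIFICATION_FUNCTIONS.items()
    }

def predict_gift(writing, reading):
    """Predict the group with the highest classification score."""
    scores = classification_scores(writing, reading)
    return max(scores, key=scores.get)

scores = classification_scores(40, 40)
for gift, score in scores.items():
    print(f"{gift}: {score:.3f}")

print(predict_gift(40, 40))   # the new student from the text
print(predict_gift(10, 10))   # hypothetical low-scoring pupil
```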

The classification results table shows that, specifically among the cases where we predicted that the student would choose the sports game, 13.9% chose the puzzle game instead. This alerts us to the risk behind this classification function.