Conduct and Interpret a Mann-Whitney U-Test

What is the Mann-Whitney U-Test?

The Mann-Whitney U-test is a statistical test that compares two groups. The U-test is a member of the larger group of dependence tests. Dependence tests assume that the variables in the analysis can be split into independent and dependent variables. A dependence test that compares the scores of a dependent variable across the groups defined by an independent variable assumes that differences in the dependent variable are caused by the independent variable. In most analyses the independent variable is also called a factor, because the factor splits the sample into two or more groups, also called factor steps.

Other dependence tests that compare the mean scores of two or more groups are the F-test, ANOVA, and the t-test family. Unlike the t-test and F-test, the Mann-Whitney U-test is a non-parametric test. That means the test does not assume any particular distribution of the underlying variables in the analysis. This makes the Mann-Whitney U-test the analysis to use when analyzing variables of ordinal scale. The Mann-Whitney U-test is also the conceptual basis for the H-test (also called Kruskal-Wallis H), which extends the same rank-based approach to more than two groups.

Because the test was initially designed in 1945 by Wilcoxon for two samples of the same size and further developed in 1947 by Mann and Whitney to cover different sample sizes, the test is also called the Mann–Whitney–Wilcoxon (MWW) test, Wilcoxon rank-sum test, Wilcoxon–Mann–Whitney test, or Wilcoxon two-sample test.

The Mann-Whitney U-test is closely related to conducting an independent samples t-test (also called 2-sample t-test) on ranked values. This approach is similar to the step from Pearson’s bivariate correlation coefficient to Spearman’s rho. The U-test, however, applies a pooled ranking of all observations from both samples. In contrast to the t-tests and the F-test, it does not compare mean scores but median scores of two samples. Thus it is much more robust against outliers and heavy-tailed distributions. Because it is non-parametric, it does not require a specific distribution of the dependent variable. It is therefore a suitable test for comparing two groups when the dependent variable is not normally distributed and is at least of ordinal scale.

For the test of significance, it is assumed that the distribution of the U-value approximates a normal distribution when the total sample size is n > 80 or when each of the two samples contains more than 30 observations. The U-value calculated from the sample can then be compared against the normal distribution to determine the significance level.

The goal of the test is to test for differences of the medians that are caused by the independent variable. Another interpretation is that the test checks whether one sample stochastically dominates the other, that is, whether for two samples X and Y, Prob(X > Y) > Prob(Y > X). The U-value represents the number of times observations in one sample precede observations in the other sample in the ranking. The Mann-Whitney U-test is sometimes also described as testing whether the two samples come from the same population, that is, whether they have the same distribution. Other non-parametric tests for comparing samples are the Kolmogorov-Smirnov Z-test and the Wilcoxon sign test.
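
The pair-counting definition of the U-value and the comparison against the normal distribution can be illustrated with a short Python sketch. It is not taken from the original analysis: the function names and the toy numbers are made up for illustration, ties are counted as one half, and the z-score uses the standard mean n1·n2/2 and variance n1·n2·(n1+n2+1)/12 of U under the null hypothesis, without tie or continuity correction. The toy samples are far smaller than the sample sizes mentioned above, so the resulting p-value only shows the mechanics.

```python
import math


def u_statistic(x, y):
    """Count the pairs (xi, yj) with xi > yj; a tie counts as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def normal_approx_z(u, n1, n2):
    """z-score of U under the null hypothesis (no tie/continuity correction)."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical toy samples, for illustration only.
x = [7, 9, 12, 15]
y = [3, 5, 8, 10]

u_x = u_statistic(x, y)   # how often an x observation exceeds a y observation
u_y = u_statistic(y, x)   # how often a y observation exceeds an x observation
z = normal_approx_z(u_x, len(x), len(y))
p = two_sided_p(z)
print(u_x, u_y, z, p)
```

The two U-values always add up to n1·n2, the total number of pairs, so either one carries the full information; dividing U by n1·n2 gives an estimate of Prob(X > Y) when there are no ties.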

The Mann-Whitney U-Test in SPSS

The research question for our U-Test is as follows: Do the students that passed the exam achieve a higher grade on the standardized reading test?

The question indicates that the independent variable is whether the students have passed the final exam or failed the final exam, and the dependent variable is the grade achieved on the standardized reading test (A to F). The Mann-Whitney U-Test can be found in Analyze/Nonparametric Tests/Legacy Dialogs/2 Independent Samples…

In the dialog box for the nonparametric two independent samples test, we select the ordinal test variable ‘mid-term exam 1’, which contains the pooled ranks, and our nominal grouping variable ‘Exam’. With a click on ‘Define Groups…’ we need to specify the valid values for the grouping variable Exam, which in this case are 0 = fail and 1 = pass.

We also need to select the Test Type. The Mann-Whitney U-Test is marked by default. Like the Mann-Whitney U-Test, the Kolmogorov-Smirnov Z-Test and the Wald-Wolfowitz runs test have the null hypothesis that both samples are from the same population. The Moses extreme reactions test has a different null hypothesis: the range of both samples is the same. The U-test compares the ranking, the Kolmogorov-Smirnov Z-test compares the differences in distributions, the Wald-Wolfowitz test compares sequences in the ranking, and the Moses test compares the ranges of the two samples. The Kolmogorov-Smirnov Z-Test requires continuous-level data (interval or ratio scale), whereas the Mann-Whitney U-Test, Wald-Wolfowitz runs test, and Moses extreme reactions test require ordinal data.

If we select Mann-Whitney U, SPSS will calculate the U-value and Wilcoxon’s W, which is the sum of the ranks for the smaller sample. If the values in the sample are not already ranked, SPSS will sort the observations according to the test variable and assign ranks to each observation.

The dialog box Exact… allows us to specify an exact non-parametric test of significance, and the dialog box Options… defines how missing values are managed and whether SPSS should output additional descriptive statistics.
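
To make the ranking step and the two reported quantities more concrete, the following Python sketch ranks a small set of letter grades and derives a rank sum and a U-value for each group. It is only an illustration under stated assumptions, not a reproduction of SPSS’s internals: the grade coding, the data, and the function names are hypothetical, tied observations receive the average of their ranks, and U is obtained from the rank sum W of a group of size n as U = W − n(n + 1)/2.

```python
# Hypothetical ordinal coding of the grades (higher = better); an assumption.
GRADE_CODE = {"F": 0, "E": 1, "D": 2, "C": 3, "B": 4, "A": 5}


def midranks(values):
    """Rank values from 1 to n; tied values get the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_and_u(scores, groups, group):
    """Return (rank sum W, U-value) of one group after pooled ranking."""
    ranks = midranks(scores)
    member_ranks = [r for r, g in zip(ranks, groups) if g == group]
    n = len(member_ranks)
    w = sum(member_ranks)
    return w, w - n * (n + 1) / 2.0


# Made-up example data: 1 = pass, 0 = fail (same coding as the grouping variable).
grades = ["A", "B", "B", "C", "C", "D", "F", "F", "D"]
exam = [1, 1, 1, 1, 0, 0, 0, 0, 0]

scores = [GRADE_CODE[g] for g in grades]
w_pass, u_pass = rank_sum_and_u(scores, exam, 1)
w_fail, u_fail = rank_sum_and_u(scores, exam, 0)
print(w_pass, u_pass, w_fail, u_fail)
```

In this toy data the pass group is the smaller sample, so its rank sum corresponds to the Wilcoxon’s W described above. The two U-values add up to the product of the two group sizes, which is a quick consistency check on the ranking.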