Mann-Whitney U-Test
© 1998 by Dr. Thomas W. MacFarland -- All Rights Reserved
************ mann_whi.doc ************

Background:

The Mann-Whitney U test is often viewed as the nonparametric
equivalent of Student's t-test for two independent groups. Like
the parametric Student's t-test, the nonparametric Mann-Whitney
U test:

-- is used to determine if a difference exists between two
   "groups," however you define "group"

-- is ideally dependent on random selection of subjects into
   their respective group

The major difference between the Mann-Whitney U Test and
Student's t-Test involves the concept of normal distribution:

-- Mann-Whitney is a nonparametric test. It works on the ranks
   of the observations rather than on the raw scores.

-- Normal distribution of data is not necessary for use of this
   test.

There is a table of U values in many statistics texts. If you
use this table:

-- The column number should be the size of the larger sample.

-- The row number should be the size of the smaller sample.

-- Samples of equal size (n) use column n and row n to
   determine the criterion U statistic.

If you use SPSS (or many of the other statistical packages) for
data analysis, you may find it far more convenient to use either
the z value or the p value in the output file to determine
significance. When using z values:

-- If the absolute value of the observed z is less than the
   critical z value of 1.96 (the two-tailed critical value at
   p <= .05), then you fail to reject the null hypothesis. The
   data do not provide sufficient evidence of a difference
   between groups, which is not the same as proof that the
   groups are equal.

-- If the absolute value of z equals or exceeds 1.96, then you
   have evidence to reject the null hypothesis.

Or, you may find it more convenient to read the printed p value
and compare it to the selected level of significance.

Scenario:

This study examines if there are any differences in outcomes in
a Pascal programming course between students who received
Computer Based Training and students who received traditional
lecture:

-- Differences (if indeed they exist) between the two teaching
   formats will be measured by student performance on a common
   final examination.
-- Based on prior experience with the test instrument, it is
   suspected that outcomes are not normally distributed (i.e.,
   they do not follow a bell-shaped curve) but are instead
   skewed to the right. Accordingly, Student's t-Test is not
   the appropriate test for difference between the two groups.
   Instead, this study will be based on the use of the
   Mann-Whitney U Test.

In this study random assignment was used to place the 30
students in Mr. Seeger's Pascal programming course into one of
two groups:

1. Students in group 1 received instruction through the use of
   Computer Based Training (CBT).

2. Students in group 2 received instruction through the use of
   traditional lecture.

A summary of the study is presented in Table 1.

Table 1

Pascal Programming Course Final Examination Scores: Breakouts
by Computer Based Training and Traditional Lecture

====================================================
                    Assigned Group
                    ==============
                    1 = CBT
Student Number      2 = Lecture       Exam Score
----------------------------------------------------
01                  1                 080
02                  1                 082
03                  1                 091
04                  1                 100
05                  1                 076
06                  1                 065
07                  1                 085
08                  1                 088
09                  1                 097
10                  1                 055
11                  1                 069
12                  1                 088
13                  1                 075
14                  1                 097
15                  1                 081
16                  2                 072
17                  2                 089
18                  2                 086
19                  2                 085
20                  2                 099
21                  2                 047
22                  2                 079
23                  2                 088
24                  2                 100
25                  2                 076
26                  2                 083
27                  2                 094
28                  2                 084
29                  2                 082
30                  2                 093
----------------------------------------------------

Ho: Null Hypothesis: There is no difference in final
    examination test scores in a Pascal programming course
    between students who received Computer Based Training and
    students who received traditional lecture (p <= .05).

Files:

1. mann_whi.doc
2. mann_whi.dat
3. mann_whi.r01
4. mann_whi.o01
5. mann_whi.con
6. mann_whi.lis

Command:

At the Unix prompt (%), key:

%spss -m < mann_whi.r01 > mann_whi.o01

************ mann_whi.dat ************

                   01              1            080
                   02              1            082
                   03              1            091
                   04              1            100
                   05              1            076
                   06              1            065
                   07              1            085
                   08              1            088
                   09              1            097
                   10              1            055
                   11              1            069
                   12              1            088
                   13              1            075
                   14              1            097
                   15              1            081
                   16              2            072
                   17              2            089
                   18              2            086
                   19              2            085
                   20              2            099
                   21              2            047
                   22              2            079
                   23              2            088
                   24              2            100
                   25              2            076
                   26              2            083
                   27              2            094
                   28              2            084
                   29              2            082
                   30              2            093

************ mann_whi.r01 ************

SET WIDTH      = 80
SET LENGTH     = NONE
SET CASE       = UPLOW
SET HEADER     = NO
TITLE          = Sign Test
COMMENT        = This file examines if Computer Based Training
                 is as equally effective as traditional lecture
                 in a Pascal programming course.  Differences
                 between the two teaching formats will be
                 measured by student performance on a common
                 final examination.
DATA LIST FILE = 'mann_whi.dat' FIXED
   / Stu_Code    20-21
     Group       36
     Score       49-51

Variable Labels
     Stu_Code  "Subject Code"
   / Group     "Assigned Group: CBT or Traditional"
   / Score     "Common Final Examination Score"

Value Labels
     Group  1 'Computer Based Training'
            2 'Traditional Lecture'

NPAR TESTS M-W = Score BY Group (1,2)

************ mann_whi.o01 ************

   1  SET WIDTH      = 80
   2  SET LENGTH     = NONE
   3  SET CASE       = UPLOW
   4  SET HEADER     = NO
   5  TITLE          = Sign Test
   6  COMMENT        = This file examines if Computer Based Training
   7                   is as equally effective as traditional lecture
   8                   in a Pascal programming course.  Differences
   9                   between the two teaching formats will be
  10                   measured by student performance on a common
  11                   final examination.
  12  DATA LIST FILE = 'mann_whi.dat' FIXED
  13     / Stu_Code    20-21
  14       Group       36
  15       Score       49-51
  16

This command will read 1 records from mann_whi.dat

Variable   Rec   Start     End         Format

STU_CODE     1      20      21         F2.0
GROUP        1      36      36         F1.0
SCORE        1      49      51         F3.0

  17  Variable Labels
  18       Stu_Code  "Subject Code"
  19     / Group     "Assigned Group: CBT or Traditional"
  20     / Score     "Common Final Examination Score"
  21
  22  Value Labels
  23       Group  1 'Computer Based Training'
  24              2 'Traditional Lecture'
  25
  26  NPAR TESTS M-W = Score BY Group (1,2)

***** Workspace allows for 18724 cases for NPAR tests *****

- - - - -  Mann-Whitney U - Wilcoxon Rank Sum W Test

     SCORE     Common Final Examination Score
  by GROUP     Assigned Group: CBT or Traditional

     Mean Rank    Cases

         14.53       15  GROUP = 1  Computer Based Train
         16.47       15  GROUP = 2  Traditional Lecture
                     --
                     30  Total

                               Exact          Corrected for ties
         U           W      2-Tailed P        Z       2-Tailed P
        98.0       218.0      .5668        -.6020       .5472

************ mann_whi.con ************

Outcome:

Significance can of course be verified by using the computed
test statistic (e.g., U) and comparing this statistic to the
criterion (i.e., table) value. It is often much easier, however,
to use the output file to verify interpretation of significance:

p = .5472

By interpretation of the p (probability) value, it is observed
that p = .55 (corrected for ties), which exceeds the selected
level of significance of .05. There is not sufficient evidence
to reject the Null Hypothesis: the study found no statistically
significant difference between the two training groups in terms
of final examination scores. Note that failing to reject the
Null Hypothesis does not prove that the two teaching formats
are equally effective.

************ mann_whi.lis ************

% minitab

MTB > outfile 'mann_whi.lis'
Collecting Minitab session in file: mann_whi.lis
MTB > # MINITAB addendum to mann_whi.dat
MTB > read 'mann_whi.dat' c1 c2 c3
Entering data from file: mann_whi.dat
     30 rows read.
MTB > print c1 c2 c3

 ROW    C1    C2    C3

   1     1     1    80
   2     2     1    82
   3     3     1    91
   4     4     1   100
   5     5     1    76
   6     6     1    65
   7     7     1    85
   8     8     1    88
   9     9     1    97
  10    10     1    55
  11    11     1    69
  12    12     1    88
  13    13     1    75
  14    14     1    97
  15    15     1    81
  16    16     2    72
  17    17     2    89
  18    18     2    86
Continue? y
  19    19     2    85
  20    20     2    99
  21    21     2    47
  22    22     2    79
  23    23     2    88
  24    24     2   100
  25    25     2    76
  26    26     2    83
  27    27     2    94
  28    28     2    84
  29    29     2    82
  30    30     2    93

MTB > # I will now UNSTACK the data to get distinct
MTB > # groups.
MTB > unstack (c2-c3) into (c5-c6) (c8-c9);
SUBC> subscripts c2.
MTB > print c1 c2 c3 c4 c5 c6 c7 c8 c9

 ROW    C1    C2    C3    C5    C6    C8    C9

   1     1     1    80     1    80     2    72
   2     2     1    82     1    82     2    89
   3     3     1    91     1    91     2    86
   4     4     1   100     1   100     2    85
   5     5     1    76     1    76     2    99
   6     6     1    65     1    65     2    47
   7     7     1    85     1    85     2    79
   8     8     1    88     1    88     2    88
   9     9     1    97     1    97     2   100
  10    10     1    55     1    55     2    76
  11    11     1    69     1    69     2    83
  12    12     1    88     1    88     2    94
  13    13     1    75     1    75     2    84
  14    14     1    97     1    97     2    82
  15    15     1    81     1    81     2    93
  16    16     2    72
  17    17     2    89
  18    18     2    86
Continue? y
  19    19     2    85
  20    20     2    99
  21    21     2    47
  22    22     2    79
  23    23     2    88
  24    24     2   100
  25    25     2    76
  26    26     2    83
  27    27     2    94
  28    28     2    84
  29    29     2    82
  30    30     2    93

* NOTE  * One or more variables are undefined.

MTB > mannwhitney c6 c9

Mann-Whitney Confidence Interval and Test

C6         N =  15     Median =       82.00
C9         N =  15     Median =       85.00
Point estimate for ETA1-ETA2 is       -2.00
95.4 pct c.i. for ETA1-ETA2 is (-11.00,6.00)
W = 218.0
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.5614
The test is significant at 0.5611 (adjusted for ties)

Cannot reject at alpha = 0.05

MTB > stop

--------------------------

Disclaimer:

All care was used to prepare the information in this tutorial.
Even so, the author does not and cannot guarantee the accuracy
of this information. The author disclaims any and all injury
that may come about from the use of this tutorial. As always,
students and all others should check with their advisor(s)
and/or other appropriate professionals for any and all
assistance on research design, analysis, selected levels of
significance, and interpretation of output file(s).

The author is entitled to exclusive distribution of this
tutorial. Readers have permission to print this tutorial for
individual use, provided that the copyright statement appears
and that there is no redistribution of this tutorial without
permission.
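Python cross-check (a sketch, not part of the original file set):

The U, W, and Z values in mann_whi.o01 can be checked by hand
from the scores in Table 1. The Python sketch below ranks the
30 scores (tied scores share the average of their ranks), sums
the ranks for group 1 to get W, converts W to U, and then uses
the normal approximation with the tie correction to get z and a
two-tailed p value. It assumes no continuity correction, which
appears to be what the "Corrected for ties" columns of the SPSS
listing report. The names CBT, LECTURE, midranks, and
mann_whitney are illustrative only and do not come from SPSS or
Minitab.

```python
import math
from collections import Counter

# Exam scores transcribed from Table 1 (mann_whi.dat).
CBT = [80, 82, 91, 100, 76, 65, 85, 88, 97, 55, 69, 88, 75, 97, 81]
LECTURE = [72, 89, 86, 85, 99, 47, 79, 88, 100, 76, 83, 94, 84, 82, 93]


def midranks(values):
    """Map each distinct value to its rank, averaging ranks over ties."""
    counts = Counter(values)
    ranks, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]
        # t tied values occupy ranks next_rank .. next_rank + t - 1,
        # so each one receives the middle of that run.
        ranks[value] = next_rank + (t - 1) / 2
        next_rank += t
    return ranks, counts


def mann_whitney(x, y):
    """Return (U, W, z, p) for group x versus group y.

    W is the rank sum of x and U is the U statistic of x.
    z uses the normal approximation with the tie correction and
    no continuity correction (an assumption); p is two-tailed.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, counts = midranks(x + y)
    w = sum(ranks[v] for v in x)
    u = w - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return u, w, z, p


u, w, z, p = mann_whitney(CBT, LECTURE)
print(f"U = {u:.1f}  W = {w:.1f}  Z = {z:.4f}  2-tailed P = {p:.4f}")

# Decision rule from the Background section: reject the null
# hypothesis only when |z| equals or exceeds 1.96.
reject = abs(z) >= 1.96
print("Reject Ho" if reject else "Cannot reject Ho at alpha = 0.05")
```

The printed values should agree, to within rounding, with the
SPSS listing (U = 98.0, W = 218.0, Z = -.6020, 2-Tailed P =
.5472). The Minitab p values (0.5614 and 0.5611) differ slightly
because that procedure evidently applies a continuity correction.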
Prepared 980316
Revised  980914
end-of-file 'mann_whi.ssi'