Stem and Leaf Plot

A stem and leaf plot is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient and easily drawn form. A stem and leaf plot is similar to a histogram but is usually a more informative display for relatively small data sets (<100 data points). It provides a table as well as a picture of the data and from it we can readily write down the data in order of magnitude, which is useful for many statistical procedures. (Definition taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)


Example

The following data represent measurements of carbon monoxide content (in mg) for 25 brands of cigarettes:
13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0, 10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9.

A MINITAB stemplot for these data can be created using the "STEM" command. MINITAB first truncates each value to an integer (dropping the decimal part rather than rounding to the nearest integer), then sorts the data. The resulting dataset is the following:
1, 4, 5, 8, 9, 9, 10, 10, 10, 10, 12, 12, 13, 13, 13, 14, 14, 15, 15, 15, 16, 16, 17, 18, 23.
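
The truncate-then-sort step can be reproduced outside MINITAB. The following is a minimal Python sketch, not MINITAB's own code; the variable names co_mg and truncated are illustrative only.

```python
import math

# Carbon monoxide content (mg) for the 25 brands, as listed above
co_mg = [13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4,
         13.0, 14.4, 10.0, 10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9,
         15.9, 8.5, 10.6, 13.9, 14.9]

# Drop the decimal part of each value (truncate, do not round), then sort
truncated = sorted(math.trunc(x) for x in co_mg)
print(truncated)
```

Note that truncation differs from rounding: 16.6 becomes 16, not 17.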

The first column of the MINITAB stemplot gives cumulative counts of the values, working from the top down and from the bottom up toward the row containing the middle value (the median). The number in parentheses is the count of values in the row that contains the median, which in this example is the thirteenth ordered value, 13.0.
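
The counting rule for the first column can be sketched in Python as follows. This is an illustration under stated assumptions, not MINITAB's implementation: it starts from the truncated, sorted values listed earlier, assumes each stem is split into five rows of two leaf digits, and assumes an odd number of values so that the median falls in a single row. The helper row_index is a hypothetical name.

```python
from itertools import accumulate

# Truncated, sorted values from the example
values = [1, 4, 5, 8, 9, 9, 10, 10, 10, 10, 12, 12, 13,
          13, 13, 14, 14, 15, 15, 15, 16, 16, 17, 18, 23]

def row_index(v):
    """Row number of a value: five rows per stem, two leaf digits per row."""
    return (v // 10) * 5 + (v % 10) // 2

# Count the values that fall in each row
n_rows = row_index(max(values)) + 1
counts = [0] * n_rows
for v in values:
    counts[row_index(v)] += 1

# Row holding the median (assumes an odd number of values)
median_row = row_index(sorted(values)[len(values) // 2])

from_top = list(accumulate(counts))                     # running totals, top down
from_bottom = list(accumulate(reversed(counts)))[::-1]  # running totals, bottom up

depths = []
for i in range(n_rows):
    if i < median_row:
        depths.append(str(from_top[i]))
    elif i == median_row:
        depths.append(f"({counts[i]})")  # count within the median row only
    else:
        depths.append(str(from_bottom[i]))

print(depths)
```

For these data the counts above the median row, the parenthesized count, and the counts below it satisfy 10 + 5 + 10 = 25, which accounts for every value.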

The second column plots the stems, which are tens of milligrams of carbon monoxide content. The third column plots the leaves, which are single milligrams. Because the range of the data is small (the only stem values are 0, 1, and 2), MINITAB splits each stem into five rows, so that each row holds two possible leaf values. In other words, for each stem the first row includes the leaf values 0 and 1, the second row includes the leaf values 2 and 3, and so on. Since there are no values between 1.5 and 4.9, the second row contains no data points. The stemplot illustrates that the majority of the measurements lie in the teens, with only 6 of the 25 values less than 10 and only 1 value greater than 20.
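
The stem and leaf columns themselves can be rebuilt with a short Python sketch. It assumes the same layout as described above (stem = tens digit, leaf = units digit, five rows per stem) and prints only rows between the smallest and largest values; it is an illustration, not MINITAB output, and row_index is a hypothetical helper.

```python
# Truncated, sorted values from the example
values = [1, 4, 5, 8, 9, 9, 10, 10, 10, 10, 12, 12, 13,
          13, 13, 14, 14, 15, 15, 15, 16, 16, 17, 18, 23]

def row_index(v):
    """Row number of a value: five rows per stem, two leaf digits per row."""
    return (v // 10) * 5 + (v % 10) // 2

# Collect the leaf (units digit) of every value under its row
leaves = {}
for v in sorted(values):
    leaves.setdefault(row_index(v), []).append(str(v % 10))

# Emit one line per row, including rows with no leaves
lines = []
for r in range(row_index(min(values)), row_index(max(values)) + 1):
    stem = r // 5
    lines.append(f"{stem} {''.join(leaves.get(r, []))}".rstrip())

print("\n".join(lines))
```

The printed plot has twelve rows. The second row shows only the stem 0 with no leaves, and the longest rows are the two in the low and middle teens (leaves 22333 and 44555), matching the description of where most measurements lie.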

Data source: Mendenhall, William, and Sincich, Terry (1992), Statistics for Engineering and the Sciences (3rd ed.), New York: Dellen Publishing Co. (ISBN: 0 02380552 8) (Original source: Federal Trade Commission, USA) Dataset available through the JSE Dataset Archive.
