Five-number summary

From Wikipedia, the free encyclopedia

The five-number summary is a descriptive statistic that provides information about a set of observations. It consists of the five most important sample percentiles:

  1. the sample minimum (smallest observation)
  2. the lower quartile or first quartile
  3. the median (middle value)
  4. the upper quartile or third quartile
  5. the sample maximum (largest observation)

For these statistics to exist, the observations must come from a univariate variable measured on an ordinal, interval or ratio scale.

Use and representation

The five-number summary provides a concise summary of the distribution of the observations. Reporting five numbers avoids the need to decide on the most appropriate summary statistic. The five-number summary gives information about the location (from the median), spread (from the quartiles) and range (from the sample minimum and maximum) of the observations. Since it reports order statistics (rather than, say, the mean) the five-number summary is appropriate for ordinal measurements, as well as interval and ratio measurements.

It is possible to quickly compare several sets of observations by comparing their five-number summaries, which can be represented graphically using a boxplot.

In addition to the points themselves, many L-estimators can be computed from the five-number summary, including interquartile range, midhinge, range, mid-range, and trimean.
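
As a rough illustration, the derived quantities named above follow from simple arithmetic on the five values. The sketch below is in Python, chosen only for illustration; the FiveNumberSummary type and l_estimators function are hypothetical names, and the input values are those of the moons example in this article. It takes the midhinge as the mean of the two quartiles and the trimean as (Q1 + 2 × median + Q3)/4.

```python
from typing import Dict, NamedTuple


class FiveNumberSummary(NamedTuple):
    """Hypothetical container for the five values, in ascending order."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float


def l_estimators(s: FiveNumberSummary) -> Dict[str, float]:
    """Derive common L-estimators from a five-number summary."""
    return {
        "range": s.maximum - s.minimum,
        "mid_range": (s.minimum + s.maximum) / 2,
        "interquartile_range": s.q3 - s.q1,
        "midhinge": (s.q1 + s.q3) / 2,
        "trimean": (s.q1 + 2 * s.median + s.q3) / 4,
    }


# Values taken from the moons example in this article.
summary = FiveNumberSummary(0, 0.5, 7.5, 44, 63)
estimates = l_estimators(summary)
for name, value in estimates.items():
    print(f"{name}: {value}")
```

None of these quantities needs the raw observations; the five summary values are enough.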

The five-number summary is sometimes represented as in the following table:

              median
 1st quartile        3rd quartile
 Minimum             Maximum

Example

This example calculates the five-number summary for the following set of observations: 0, 0, 1, 2, 63, 61, 27, 13. These are the numbers of moons of each planet in the Solar System.

It helps to put the observations in ascending order: 0, 0, 1, 2, 13, 27, 61, 63. There are eight observations, so the median is the mean of the two middle numbers, (2 + 13)/2 = 7.5. Splitting the observations on either side of the median gives two groups of four observations. The median of the first group is the lower or first quartile, and is equal to (0 + 1)/2 = 0.5. The median of the second group is the upper or third quartile, and is equal to (27 + 61)/2 = 44. The smallest and largest observations are 0 and 63.

So the five-number summary would be 0, 0.5, 7.5, 44, 63.
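
The hand calculation above can be sketched in a few lines of Python (used here only for illustration). The function name five_number_summary is hypothetical. The text only covers an even number of observations; for an odd count this sketch assumes the middle observation is included in both halves, which is one common convention (Tukey's hinges) among several.

```python
from statistics import median


def five_number_summary(observations):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(observations)
    if not data:
        raise ValueError("need at least one observation")
    n = len(data)
    # Size of each half. Assumption: when n is odd, the middle value
    # is counted in both the lower and the upper half.
    half = (n + 1) // 2
    lower = data[:half]
    upper = data[n - half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


moons = [0, 0, 1, 2, 63, 61, 27, 13]
result = five_number_summary(moons)
print(result)  # (0, 0.5, 7.5, 44.0, 63)
```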

Example in R

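The five-number summary can be calculated in the R programming language with the fivenum function. The summary function, when applied to a numeric vector, reports the minimum, quartiles, median and maximum together with the mean (which is not itself a part of the five-number summary). Note that summary computes its quartiles with R's default quantile method rather than as medians of the two halves, so its first and third quartiles can differ from those returned by fivenum, as they do for this data set.

 > moons <- c(0, 0, 1, 2, 63, 61, 27, 13)
 > fivenum(moons)
 [1]  0.0  0.5  7.5 44.0 63.0
 > summary(moons)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.00    0.75    7.50   20.88   35.50   63.00 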
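
Quartile conventions vary between software packages, so it can be worth checking which rule a summary routine applies. As a rough sketch in Python (chosen only for illustration), the following compares the median-of-halves rule used in the hand calculation with the linear-interpolation rule offered by the standard library's statistics.quantiles with method="inclusive". The variable names are illustrative only.

```python
from statistics import median, quantiles

moons = sorted([0, 0, 1, 2, 63, 61, 27, 13])

# Median-of-halves quartiles, as in the hand calculation.
half = (len(moons) + 1) // 2
hinges = (median(moons[:half]), median(moons[len(moons) - half:]))

# Quartiles by linear interpolation between the closest ranks.
q1, q2, q3 = quantiles(moons, n=4, method="inclusive")

print(hinges)    # (0.5, 44.0)
print((q1, q3))  # (0.75, 35.5)
```

Both rules agree on the median (7.5) but place the quartiles differently, which matters most for small samples like this one.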
