
Box Plot


A box plot provides an intuitive graphical representation of the five-number summary of a dataset. The five-number summary consists of the minimum, the first quartile (Q1), the median (Q2), the third quartile (Q3), and the maximum of the dataset. John W. Tukey introduced the box plot in his book Exploratory Data Analysis, published in 1977. It is also referred to as a box-and-whisker plot.
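As a rough illustration of the five-number summary, here is a minimal Python sketch. The function name five_number_summary and the sample values are hypothetical (they are not the data used in the figures), and the quartiles use the "inclusive" linear-interpolation convention from Python's statistics module; other conventions, such as Tukey's hinges, may give slightly different values.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers."""
    values = sorted(data)
    # method="inclusive" interpolates linearly between data points.
    # This is one of several quartile conventions, chosen here as an assumption.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return values[0], q1, q2, q3, values[-1]


# Hypothetical sample, not the data shown in the article's figure.
sample = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(sample)
print(summary)  # (1, 3.0, 5.0, 7.0, 9)
```

These five values are exactly what the box plot draws: the two whisker ends, the two box borders, and the line inside the box.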

Let us understand the power of the box plot through a series of examples. The first example shows a box plot along with its sample data.

Example of Box Plot with Sample Data

The box represents the interquartile range (IQR = Q3 - Q1): its left border (also called a hinge) corresponds to the first quartile (Q1) and its right border corresponds to the third quartile (Q3). Therefore, the middle 50% of the data values fall within the box. The line inside the box marks the median of the data. The left whisker spans the smallest 25% of the data values, with its leftmost end at the minimum value of the data. Similarly, the right whisker spans the largest 25% of the data values, with its rightmost end at the maximum value of the data.


Let us look at the second data set and the corresponding box plot. The data has been superimposed on the histogram, with the box plot aligned directly above it, to give you a crisp and easy-to-understand picture.

Box Plot with Histogram

The histogram in the figure suggests that the data:

  1. Has no skew, implying a symmetrical distribution.
  2. Has long tails, i.e. it possibly contains outliers.

Now, it is time to look at the box plot. Notice that both whiskers are much longer than the box (IQR) - an indication of the possible presence of outliers. In fact, Tukey suggests treating as an outlier any point that lies more than 1.5 times the IQR below Q1 or above Q3. Here is the same box plot, but with the outliers (0.95 and 1.05) clearly highlighted as per Tukey's recommendation.
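Tukey's 1.5 x IQR rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name tukey_outliers, the parameter k, and the readings list are hypothetical (the article's own data, with outliers 0.95 and 1.05, is not reproduced here), and the quartiles follow the "inclusive" convention of the statistics module.

```python
import statistics


def tukey_outliers(data, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    values = sorted(data)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr  # points below this are flagged
    upper_fence = q3 + k * iqr  # points above this are flagged
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers


# Hypothetical readings with one suspiciously large value.
readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
lower, upper, flagged = tukey_outliers(readings)
print(lower, upper, flagged)  # -3.5 14.5 [30]
```

When outliers are drawn separately like this, the whiskers are usually shortened to end at the most extreme data points that still lie inside the fences.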

Box Plot with Histogram

The above box plots (with or without outliers) also reveal that the two whiskers are of equal length and the median lies right in the middle of the box - an indication of a symmetrical distribution. Any deviation from this pattern indicates a non-symmetrical (skewed) distribution, as illustrated in the following box plot.
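The symmetry check described above amounts to comparing four lengths: the two whiskers and the two halves of the box. The following Python sketch computes them. The function name box_plot_shape and both data sets are hypothetical, the whiskers are assumed to run to the minimum and maximum (no outliers drawn separately), and the quartiles use the "inclusive" convention of the statistics module.

```python
import statistics


def box_plot_shape(data):
    """Return the lengths of the four segments of a box plot."""
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "left_whisker": q1 - values[0],    # minimum to Q1
        "left_box": median - q1,           # Q1 to median
        "right_box": q3 - median,          # median to Q3
        "right_whisker": values[-1] - q3,  # Q3 to maximum
    }


# Hypothetical data sets chosen for illustration only.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
skewed = [1, 2, 3, 4, 6, 9, 13, 18, 30]

print(box_plot_shape(symmetric))  # all four segments are 2.0
print(box_plot_shape(skewed))     # right-hand segments are longer
```

For the skewed sample the right half of the box and the right whisker are both longer than their left counterparts, which is what a right-skewed box plot looks like.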

Box Plot with skew

The box plot also serves as a quick way to compare two or more series: simply juxtapose the box plots of the series to be investigated. The following plot shows the previous two box plots juxtaposed to highlight the differences (or similarities) in their central tendencies and dispersions.

Juxtaposed Box Plots

Important Observations

  1. The box plot is based on robust statistics (the median and the quartiles), i.e. it is more tolerant of (or robust to) the presence of outliers.
  2. It gives an indication of the shape of the distribution in terms of symmetry or skewness.
  3. It is an excellent means to determine if there are similarities (or differences) between two or more data sets by juxtaposing their box plots.
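The robustness mentioned in the first observation can be illustrated with a small Python sketch. The two series and the helper name iqr are hypothetical; the second series is the first with its largest value replaced by an extreme one. The quartiles use the "inclusive" convention of the statistics module, which is an assumption rather than the only valid choice.

```python
import statistics


def iqr(data):
    """Interquartile range (Q3 - Q1) using inclusive linear interpolation."""
    q1, _, q3 = statistics.quantiles(sorted(data), n=4, method="inclusive")
    return q3 - q1


# Hypothetical series: 'contaminated' differs from 'clean' by one extreme value.
clean = [10, 11, 12, 13, 14]
contaminated = [10, 11, 12, 13, 140]

for name, series in (("clean", clean), ("contaminated", contaminated)):
    print(
        name,
        "mean:", statistics.mean(series),
        "stdev:", round(statistics.stdev(series), 2),
        "median:", statistics.median(series),
        "IQR:", iqr(series),
    )
```

In this example the median and IQR, which determine the box, are the same for both series, while the mean and standard deviation shift substantially because of the single extreme value.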

June, 2011



Copyright © 2005-14 Sanjaya Kumar Saxena
Rights reserved. Please seek permission for reproduction.