Statistical Graphs

Bar graphs, line graphs, and pie graphs are three of the most common statistical graphics; in fact, they are ubiquitous. Each is used for a different purpose.

The most common use of a bar graph is to show the relationship between a categorical variable and a continuous one. For example, to compare what proportion of Blacks, Whites, Latinos, Asians and others voted for Obama, we would make a graph with 5 bars (one for each group); the height of each bar would correspond to the proportion of that group voting for Obama (ranging from 0 to 1).
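To make the mapping from proportion to bar height concrete, here is a minimal sketch that draws a text-only bar graph using nothing but the Python standard library. The function name `text_bar_chart`, the group labels, and the proportions are all hypothetical, invented for illustration; they are not real election data.

```python
# Hypothetical proportions, invented purely for illustration (not real data).
proportions = {
    "Group A": 0.90,
    "Group B": 0.40,
    "Group C": 0.65,
    "Group D": 0.55,
    "Group E": 0.70,
}


def text_bar_chart(data, width=40):
    """Return one text line per category; bar length is proportional to the value."""
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # A proportion outside [0, 1] cannot be drawn on this scale.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"proportion for {label!r} must be between 0 and 1")
        bar = "#" * round(value * width)
        lines.append(f"{label:<{label_width}} | {bar} {value:.2f}")
    return lines


for line in text_bar_chart(proportions):
    print(line)
```

Every bar starts from the same baseline (zero) and uses the same scale, which is what lets the reader compare groups by length alone.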

There are more complex versions of bar graphs, such as stacked bars, but these can easily be confusing, and there are better alternatives for display of this type of information.

A line graph is also most commonly used to compare two variables, but here both should be continuous or, at a minimum, ordered. For example, if we wanted to look at the proportion of people voting for the Democratic candidate in presidential elections since World War II, we could have “Year” on the x-axis (that’s the horizontal one) and “proportion” on the y-axis (the vertical one). We could put a dot for each election (1948, 1952, 1956, etc.) and then connect those dots with lines.

There are some complications here, as well:

Should the lines be straight? If not, what form of curve would be best? Perhaps a spline curve?

If there are multiple lines (suppose we also want to track the proportion voting for the Republican and the proportion voting for others), should the lines be different colors? Which colors? (Remember, some people are color blind, and some printers are black and white.) Where should the legend be?
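On the first question, it may help to see what a straight connecting line actually asserts. Joining two dots with a straight segment is linear interpolation: it implies a value for every point in between, including years in which no election was held. The sketch below assumes the x-values are strictly increasing; the function name `interpolate` and the proportions are hypothetical and invented for illustration.

```python
def interpolate(points, x):
    """Value at x on the straight segments joining (x, y) points.

    Assumes at least two points, with strictly increasing x-values.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to draw a line")
    if not points[0][0] <= x <= points[-1][0]:
        raise ValueError("x lies outside the plotted range")
    # Find the segment containing x and move along it proportionally.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)


# Hypothetical (year, proportion) pairs, invented for illustration only.
elections = [(1948, 0.50), (1952, 0.40), (1956, 0.60)]

# The line passes through a value for 1950 even though no election took place.
print(interpolate(elections, 1950))
```

A spline or other curve would imply different in-between values from the same dots, which is one reason the choice of line is not purely cosmetic.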

A pie graph is typically used to show how some whole thing is divided up. For example, if we wanted to see what proportion of the people who voted for Obama were Black, White, etc., we could make a circle and divide it into slices; if (say) 33% of the people who voted for Obama are White, then the slice for ‘White’ would be about 1/3 of the pie.
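The arithmetic behind the slices is simple: each category's share of the whole becomes the same share of the full 360 degrees. As a minimal sketch, using the 33% figure from the example above and lumping everyone else into one hypothetical remainder category (the function name `slice_angles` is also an assumption made for illustration):

```python
def slice_angles(shares):
    """Map each category's share of the whole to a central angle in degrees."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive total")
    return {label: 360.0 * value / total for label, value in shares.items()}


# 33% is the illustrative figure from the text; the remainder category
# is a hypothetical placeholder for all other groups combined.
shares = {"White": 33, "All other groups": 67}

for label, angle in slice_angles(shares).items():
    print(f"{label}: {angle:.1f} degrees")
```

A 33% share works out to just under 120 degrees, which is why the slice is "about 1/3" of the circle.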

Pie graphs are not particularly good graphs; William Cleveland and others have shown that they distort the data and are not easy for people to interpret correctly. For instance, if you rotate the pie, the slices will appear to be different sizes.

In general, a lot of people think they can make a good graph just by clicking on some menus in Excel or some other package, but making a good graph is complex. If you want your data to be displayed to best effect, it pays to consult with an expert in statistical graphics.

For more information see:

William S. Cleveland: Visualizing Data; The Elements of Graphing Data

Edward Tufte: The Visual Display of Quantitative Information; Visual Explanations; Envisioning Information; Beautiful Evidence

Naomi Robbins: Creating More Effective Graphs

Howard Wainer: Picturing the Uncertain World; Graphic Discovery; Visual Revelations