The arithmetic mean (aka mean or average), the median, and the mode

In this introductory statistics article, we will explore the mean (formally the arithmetic mean, informally the average), see how it’s used and abused, and compare it to the median and the mode.

When can the mean be calculated?

There are various ways to classify variables. One useful way is to distinguish between continuous and categorical data. Data are continuous if they can (at least in theory) take on any number within some range. Data are categorical if they can only take on one of a fixed set of values, which are usually labels rather than numbers. For example, weight, income, age and IQ are continuous. Choice of whom to vote for (e.g. McCain or Obama), political party, hair color, and marital status are categorical. We will discuss this more in a later article.

When you have continuous data, two things that you often want to know are “What values are likely?” and “How spread out are the values?” Today, we will look at the first question, which, in statisticians’ language, is a question about central tendency. The most common measure of central tendency is the mean, more formally the arithmetic mean, and less formally the average. (To see why the mean makes no sense for categorical data – well, what’s the average of McCain and Obama? Or of married and single? Perhaps the latter is “engaged”?)

How to calculate the mean

The mean is probably familiar, even if you only know it as the average. Add up the numbers, divide by how many numbers there are, and you’ve got the mean. So, for example, if the IQs of the people in your family are

155 (that would be you)

135 (your sister)

and

70 (her husband)

then the mean is (155 + 135 + 70)/3 = 120.

Or, suppose the heights of the students in introductory psychology are (in inches, rounded to the nearest inch)

64 65 64 67 64 67 66 70 66 66

66 64 69 69 62 67 64 59 66 67

65 71 67 68 59 69 67 65 68 66

68 67 75 67 69 70 67 76 67 70

68 67 78 67 73 64 75 65 70 68.

The arithmetic mean of the above is 67.36 inches.
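The recipe above (add up the numbers, then divide by how many there are) is short enough to write out directly. The following Python sketch does so for both examples. The function name `arithmetic_mean` is only an illustrative choice; Python’s standard library offers `statistics.mean` for the same job.

```python
from statistics import mean

def arithmetic_mean(values):
    """Add up the numbers and divide by how many numbers there are."""
    return sum(values) / len(values)

iqs = [155, 135, 70]

# Heights (inches) of the 50 introductory psychology students, as listed above
heights = [
    64, 65, 64, 67, 64, 67, 66, 70, 66, 66,
    66, 64, 69, 69, 62, 67, 64, 59, 66, 67,
    65, 71, 67, 68, 59, 69, 67, 65, 68, 66,
    68, 67, 75, 67, 69, 70, 67, 76, 67, 70,
    68, 67, 78, 67, 73, 64, 75, 65, 70, 68,
]

print(arithmetic_mean(iqs))      # 120.0
print(arithmetic_mean(heights))  # 67.36
print(mean(heights))             # the standard-library function agrees
```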

The mean: When not to use it

The mean is a bad choice if the data are skewed, which means that there is a ‘tail’ to the distribution on one side, but not the other. One common example of this is income. Some people make a whole lot more than the average person, but no one can make that much less. For instance, if the average income in the USA is $30,000 per year (I made that up), then there are some people who make millions more than that, but even the poorest people make at most $30,000 less. When the data are skewed, the median and the trimmed or Winsorized mean are good choices. (You don’t see the trimmed mean much, but it can be very useful.) I will cover these in later articles.

The mean is also a bad choice if the data are multimodal, which means they have two or more “humps”. For instance, if you had data on the heights of basketball players and jockeys, taking the overall mean would not be very informative.
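To make this concrete, here is a small Python sketch with invented heights (the numbers are hypothetical, chosen only to produce two humps). The overall mean lands in the gap between the two groups, at a height that nobody in the data comes close to.

```python
from statistics import mean

# Hypothetical heights in inches, made up for illustration only
jockeys = [62, 63, 64, 62, 63]
basketball_players = [78, 79, 80, 79, 78]
everyone = jockeys + basketball_players

overall = mean(everyone)
# Distance from the overall mean to the closest actual person
closest_gap = min(abs(h - overall) for h in everyone)

print(overall)      # 70.8
print(closest_gap)  # more than 6 inches: nobody is near the "average" height
```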

The mean: What can go wrong

People sometimes try to average things that shouldn’t be averaged. The most common mistake is to average percentages. This is a bad idea. Here are some data from the 2008 presidential election (I will use just 4 states, to keep it simple; the same thing applies with all 50):

State Obama McCain

CA 61% 37%

NY 63% 36%

WY 33% 65%

UT 34% 63%

If one averages the percentages, one gets 48% for Obama, (61 + 63 + 33 + 34)/4, and 50% for McCain, (37 + 36 + 65 + 63)/4, but that isn’t right. A percentage is a form of fraction, and to combine percentages you have to add the numerators and the denominators separately and then form a new percentage; that is, add up the number of people voting for each candidate and then get the percentage from the total. Here are the vote totals, in millions of people:

State Obama McCain

CA 8.2 5.0

NY 4.8 2.8

WY 0.1 0.2

UT 0.3 0.6

Tot 13.4 8.6

In these four states, Obama got 13.4/(13.4 + 8.6) = 61% of the votes cast for the two candidates, not 48%.
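The difference between the two calculations can be checked with a few lines of Python. This is only a sketch using the rounded figures from the two tables above; the variable names are illustrative.

```python
# Obama's reported percentage in each state (first table)
obama_pct = {"CA": 61, "NY": 63, "WY": 33, "UT": 34}

# Votes in millions for (Obama, McCain) in each state (second table)
votes = {"CA": (8.2, 5.0), "NY": (4.8, 2.8), "WY": (0.1, 0.2), "UT": (0.3, 0.6)}

# Wrong: average the percentages, treating tiny Wyoming like huge California
naive = sum(obama_pct.values()) / len(obama_pct)

# Right: add up the votes first, then form a single percentage
obama_total = sum(obama for obama, _ in votes.values())
both_total = sum(obama + mccain for obama, mccain in votes.values())
pooled = 100 * obama_total / both_total

print(f"average of percentages:     {naive:.1f}%")
print(f"percentage of pooled votes: {pooled:.1f}%")
```

The pooled figure is in effect a weighted average of the state percentages, with each state weighted by how many people voted there.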

As another example, suppose I ask the following:

In September, Joe’s average gas mileage was 30 mpg. In October, it was 20 mpg. What was his average gas mileage for September and October? You might think: 20 + 30 = 50, divided by 2 is 25. But that’s not necessarily the mean, because he might have driven different distances in the two months. If he drove 2000 miles in September and 500 in October, then in September he used 2000/30 = 66.7 gallons, and in October he used 500/20 = 25 gallons. So, in total, he used 91.7 gallons to drive 2500 miles, and the mean is 2500/91.7 = 27.3 mpg.
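In code, the safe way to combine mileages is to go back to total miles and total gallons. Here is a minimal Python sketch, where `overall_mpg` is a hypothetical helper name and the distances are the ones assumed in the example:

```python
def overall_mpg(trips):
    """trips is a list of (miles_driven, miles_per_gallon) pairs."""
    total_miles = sum(miles for miles, _ in trips)
    total_gallons = sum(miles / mpg for miles, mpg in trips)
    return total_miles / total_gallons

joe = [(2000, 30), (500, 20)]   # September, October

naive = (30 + 20) / 2
print(naive)                       # 25.0
print(round(overall_mpg(joe), 1))  # 27.3 when the gallons are not rounded along the way
```

The simple average of 25 mpg would be right only if Joe had used the same number of gallons in each month.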

To find the median and the trimmed mean, it is useful to first sort the numbers from smallest to largest. Here are the heights of the psychology students again, sorted:

59 59 62 64 64 64 64 64 64 65

65 65 65 66 66 66 66 66 66 67

67 67 67 67 67 67 67 67 67 67

67 68 68 68 68 68 69 69 69 69

70 70 70 70 71 73 75 75 76 78.

The median is the number that splits the data into two equal halves, with half being higher, and half lower (there are slightly more technical definitions, to deal with things like ties, and sparse data, but this will do for our purposes). The median height in the psychology class is 67 inches.
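For readers who like to see the steps spelled out, here is a rough Python sketch of the simple definition. For an even count it averages the two middle values, which is one common way of handling the technical details mentioned above. The name `median_of` is illustrative; `statistics.median` in the standard library follows the same convention.

```python
from statistics import median

def median_of(values):
    """Middle value of the sorted data (mean of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# The sorted heights (inches) of the 50 students
heights = [
    59, 59, 62, 64, 64, 64, 64, 64, 64, 65,
    65, 65, 65, 66, 66, 66, 66, 66, 66, 67,
    67, 67, 67, 67, 67, 67, 67, 67, 67, 67,
    67, 68, 68, 68, 68, 68, 69, 69, 69, 69,
    70, 70, 70, 70, 71, 73, 75, 75, 76, 78,
]

print(median_of(heights))  # 67.0
print(median(heights))     # the standard-library function agrees
```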

A less commonly used measure is the trimmed mean. The trimmed mean is the mean after you throw out some extreme values (typically the highest 10% and the lowest 10%).

For the 10% trimmed mean, we delete the 5 smallest values (59, 59, 62, 64, 64) and the 5 largest values (73, 75, 75, 76, 78), and take the mean of the remaining 40. Here, the trimmed mean is about 67.08 inches, almost the same as the regular mean of 67.36 inches.
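The trimming procedure can be sketched in Python as follows. The helper name `trimmed_mean` and its `proportion` parameter are illustrative, and dropping `int(n * proportion)` values from each end is one reasonable convention among several (SciPy’s `scipy.stats.trim_mean` works the same way, as far as I know).

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping the given proportion of values from each end.

    The proportion should be below 0.5 so that some values remain.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # how many to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Heights (inches) of the 50 students, sorted
heights = [
    59, 59, 62, 64, 64, 64, 64, 64, 64, 65,
    65, 65, 65, 66, 66, 66, 66, 66, 66, 67,
    67, 67, 67, 67, 67, 67, 67, 67, 67, 67,
    67, 68, 68, 68, 68, 68, 69, 69, 69, 69,
    70, 70, 70, 70, 71, 73, 75, 75, 76, 78,
]

print(sum(heights) / len(heights))   # ordinary mean of all 50 values
print(trimmed_mean(heights, 0.10))   # mean of the middle 40 values
```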

Sometimes, though, the trimmed mean and median can be very different from the mean. Take income: suppose you sample 50 American adults and get household incomes of

4416 11280 7339 7882 3821 14367 11223 11197 5152 6169

28058 33362 26730 23546 32838 27679 25582 31776 26288 20113

45847 44699 39966 35535 52081 41582 52301 41308 36916 44841

76424 55663 64971 58316 55778 65888 70922 70174 76397 81837

111359 114360 153072 141380 135553 97504 136559 119445 160962 405354

Sources: http://en.wikipedia.org/wiki/Household_income_in_the_United_States.

Then the mean is $60,916, but the median is $43,140, and the 10% trimmed mean is $50,540.
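These three figures can be reproduced from the sample with a short Python sketch. `statistics.mean` and `statistics.median` are standard-library functions; `trimmed_mean` is a hypothetical helper that drops `int(n * proportion)` values from each end, which is one convention among several.

```python
from statistics import mean, median

incomes = [
    4416, 11280, 7339, 7882, 3821, 14367, 11223, 11197, 5152, 6169,
    28058, 33362, 26730, 23546, 32838, 27679, 25582, 31776, 26288, 20113,
    45847, 44699, 39966, 35535, 52081, 41582, 52301, 41308, 36916, 44841,
    76424, 55663, 64971, 58316, 55778, 65888, 70922, 70174, 76397, 81837,
    111359, 114360, 153072, 141380, 135553, 97504, 136559, 119445, 160962, 405354,
]

def trimmed_mean(values, proportion=0.10):
    """Mean after dropping the given proportion of values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

print(f"mean:         ${mean(incomes):,.0f}")
print(f"median:       ${median(incomes):,.0f}")
print(f"trimmed mean: ${trimmed_mean(incomes):,.0f}")
```

The single household at $405,354 pulls the mean up by several thousand dollars on its own, whereas the median and the 10% trimmed mean would be exactly the same if that household made $200,000 instead.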

While many people have no problem with the median, sometimes the trimmed mean is looked on with suspicion. But the median is just the 50% trimmed mean – it ignores all the data except the central point.
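A quick way to see this is to trim more and more from each end and watch the result. The Python sketch below uses five made-up numbers (hypothetical data, not from the income sample) and counts trimmed values rather than percentages to keep it simple.

```python
from statistics import median

def mean_after_trimming(values, k):
    """Mean after dropping the k smallest and the k largest values."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [1, 2, 3, 4, 100]   # hypothetical values with one outlier

print(mean_after_trimming(data, 0))  # 22.0, the ordinary mean
print(mean_after_trimming(data, 1))  # 3.0
print(mean_after_trimming(data, 2))  # 3.0, only the middle value is left
print(median(data))                  # 3
```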

One other common measure is the mode, which is simply the most common value. In the height data, the mode is 67 inches, which occurs 12 times.
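Counting how often each value occurs is all it takes to find the mode. Here is a short Python sketch that uses `collections.Counter` from the standard library on the class heights:

```python
from collections import Counter

# Heights (inches) of the 50 introductory psychology students
heights = [
    64, 65, 64, 67, 64, 67, 66, 70, 66, 66,
    66, 64, 69, 69, 62, 67, 64, 59, 66, 67,
    65, 71, 67, 68, 59, 69, 67, 65, 68, 66,
    68, 67, 75, 67, 69, 70, 67, 76, 67, 70,
    68, 67, 78, 67, 73, 64, 75, 65, 70, 68,
]

counts = Counter(heights)
# most_common(1) returns a list holding the (value, count) pair with the highest count
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)   # 67 12
```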

When do you want each? When do you want to use none of them?

There are some situations where no measure works well. The most common is when the data are multimodal. That means that the data have common values that are separated by some uncommon values. For example, if you had a bunch of athletes from different sports (basketball players, football players, and jockeys), and were interested in their weights, then no measure of central tendency would be good: not the mean, nor the trimmed mean, nor the median, nor the mode.

But, more often, you want some measure of central tendency, and have to decide which one.

When the mean is good

The mean is the best measure of central tendency when the data are roughly symmetric and have no outliers, or when there are outliers, but you want them to be included. Most physical measurements (e.g. height, weight) are examples of the first type, at least if you have a single population (unlike the athlete example). The mean height of adult men is a good measure. An example of the second case is when you are trying to estimate, for example, the total purchasing power of a group of people – say, all the employees of Microsoft. This will be distorted by Bill Gates’ income, but, in this case, you want that distortion. Here, though, you might argue that you aren’t looking at central tendency at all, but at totals.

When the trimmed mean is good

The trimmed mean is the best measure of central tendency when the data have a few outliers, and those outliers are not important for your hypothesis, or will distort your research. It is also good when the data are somewhat skewed. I would argue that the trimmed mean should be used more often, and the median less often.

When the median is good

The median is the best measure of central tendency when the data are very skewed, or when there are a huge number of outliers. Because it is more widely known than the trimmed mean, it may be good when your audience is not statistically sophisticated. The classic example is income. In most populations, incomes will be highly right skewed. That is, there will be some people whose incomes are much, much higher than all the others. When the data are right skewed, the mean will be higher than the median. When they are left skewed (much less common) the mean will be lower than the median.
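The direction of the gap between mean and median is easy to check on a toy example. The numbers below are invented for illustration; the left-skewed set is just the mirror image of the right-skewed one.

```python
from statistics import mean, median

# Hypothetical incomes in thousands of dollars: one very high earner
right_skewed = [20, 25, 30, 35, 400]
# Mirror image of the same data, so the long tail is on the low side
left_skewed = [400 - x for x in right_skewed]

# Right skew: the mean (102) is above the median (30)
print(mean(right_skewed), median(right_skewed))
# Left skew: the mean (298) is below the median (370)
print(mean(left_skewed), median(left_skewed))
```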

When the mode is best

The mode is sometimes also a good measure of central tendency. Suppose, for example, you are reporting on a country where nearly everyone is a peasant making almost nothing, and there are a few multibillionaires making a lot, and a few more people in the middle. Like this:

Income Number of people

$100 per year or less 1,000,000

$1000 to $100,000 per year 10,000

More than $100,000 per year 500

Then the mean would be distorted by the few people making huge amounts. The mode would be $100 per year or less, and that would be a good representation of the typical income. (With these particular counts the median person also falls in the lowest bracket, since well over half the population is there; the mode has the advantage of saying directly what most people make.) For the mode to be useful for continuous data, you may have to do quite a bit of rounding.
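The point about rounding can be illustrated with a small Python sketch. The incomes are invented, and rounding to the nearest $100 is an arbitrary choice made for the example. Before rounding, every value is distinct, so no value is more common than any other; after rounding, a clear mode appears.

```python
from collections import Counter

# Hypothetical yearly incomes in dollars, made up for illustration
incomes = [98.20, 101.70, 99.50, 100.40, 97.10, 55000.00, 2300000.00]

raw_counts = Counter(incomes)
print(max(raw_counts.values()))   # 1: every raw value occurs only once

# Round each income to the nearest $100 before counting
rounded = [round(x / 100) * 100 for x in incomes]
mode_value, mode_count = Counter(rounded).most_common(1)[0]
print(mode_value, mode_count)     # 100 5
```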