Research Blog

By Craig Kolb, Acentric Marketing Research, 17 February 2016

Widely differing average income statistics are reported in the media. There are five main reasons for this: 1.) the type of statistic used, 2.) exclusions/inclusions of certain population groups, 3.) issues of representation, 4.) time lags and 5.) census updates.

The type of statistic used

Recently Time Magazine stated that “Flint’s 100,000 residents live below the poverty line; the median household income ($24,834) is half the average across Michigan ($48,411)”. (1) That one sentence sets a ‘median’ against an ‘average’ as though the two terms were interchangeable. In fact, both numbers are medians.

In everyday English the word average is used very loosely, and can refer to any of several statistics that measure central tendency. However, the median and the mean are two quite different statistics and should never be confused.

The median divides a population (or sample) in two, so that 50% of values are below the median and 50% are above it. For instance, based on AMPs 2012-2013 figures, the estimated South African median household income is around R6,000, meaning that half of all households are below this value and half above. The main advantage of using the median is that it is not easily distorted by a small minority with very large incomes (outliers).

In contrast, the mean is calculated by summing the incomes of the population of households and dividing by the population size. Many people refer to this as the average, and it should be a very familiar calculation. Based on AMPs 2012-2013 figures, an estimate of South Africa’s household mean income is around R12,000. As can be seen, the median and mean are quite different.
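The gap between the two statistics can be sketched with Python's standard `statistics` module. The ten household incomes below are made-up figures chosen for illustration, not AMPs data.

```python
from statistics import mean, median

# Hypothetical monthly household incomes in rand (illustrative only).
incomes = [2000, 3000, 4000, 5000, 6000, 6000, 7000, 9000, 12000, 150000]

median_income = median(incomes)  # middle of the sorted values
mean_income = mean(incomes)      # sum divided by the number of households
print(median_income, mean_income)

# Drop the single very large income: the median stays where it was,
# while the mean falls sharply.
without_outlier = incomes[:-1]
print(median(without_outlier), mean(without_outlier))
```

In this invented example one high-income household is enough to pull the mean to more than three times the median.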

In practice, less direct methods are used to approximate the median and mean, since each individual household’s exact income is very rarely available to statisticians. Instead, survey and census data are usually collected in the form of income brackets, also referred to as intervals (F. Thomas Juster pioneered the use of brackets as a way to improve survey response rates). The resulting frequency tabulations are referred to as grouped data.

A common approach to estimating the mean from grouped data is to calculate a midpoint estimate for each income bracket (interval) and multiply it by the size of the sub-population in that bracket. The sum of these products is then divided by the total population size.
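The midpoint calculation can be sketched as follows. The brackets and household counts are hypothetical, and the function name `grouped_mean` is introduced here only for illustration.

```python
# Hypothetical income brackets: (lower bound, upper bound, households).
brackets = [
    (0, 2000, 300),
    (2000, 6000, 400),
    (6000, 12000, 200),
    (12000, 30000, 100),
]

def grouped_mean(brackets):
    """Estimate the mean from grouped data using bracket midpoints."""
    total_households = sum(count for _, _, count in brackets)
    # Midpoint of each bracket multiplied by the households in it.
    weighted_sum = sum((lower + upper) / 2 * count
                       for lower, upper, count in brackets)
    return weighted_sum / total_households

print(grouped_mean(brackets))
```

With these invented counts the midpoints are 1,000, 4,000, 9,000 and 21,000, and the estimated mean is 5,800.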

Of course, using the midpoint to represent the interval is only an approximation of the actual data, and sometimes far from optimal, since the distribution within each bracket is unknown. In some cases researchers may instead use logarithmic means or other estimates of the ‘midpoint’. However, probably an even greater source of variation in reported medians and means arises when estimating the midpoint of the top income bracket, as this bracket is usually unbounded. Each analyst will use his or her own approach, leading to differences in the estimated median or mean income.
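A rough sketch of how much the choice for the open-ended top bracket can matter is given below. The brackets, the counts and the three multipliers applied to the top bracket's lower bound are all assumptions made for illustration; they are not a recommended method.

```python
# Hypothetical closed brackets: (lower bound, upper bound, households).
closed = [
    (0, 2000, 300),
    (2000, 6000, 400),
    (6000, 12000, 200),
    (12000, 30000, 80),
]
# Open-ended top bracket: R30,000 and above, 20 households (assumed).
top_lower, top_count = 30000, 20

def mean_with_top_value(top_value):
    """Grouped mean when the top bracket is represented by top_value."""
    total = sum(count for _, _, count in closed) + top_count
    weighted = sum((lower + upper) / 2 * count
                   for lower, upper, count in closed)
    weighted += top_value * top_count
    return weighted / total

# Three analysts, three assumptions about the 'midpoint' of the top bracket.
for multiplier in (1.0, 1.5, 3.0):
    print(multiplier, mean_with_top_value(top_lower * multiplier))
```

Although only 2% of the households in this example sit in the top bracket, the estimated mean ranges from 5,980 to 7,180 depending on the value assumed for it.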

Who is included and who is excluded?

Income statistics reported in the media frequently exclude agricultural workers. For instance, Statistics South Africa’s September 2015 QES estimated the average monthly gross salary as R17,387, a figure which excludes agricultural workers and is therefore higher than it would be if they were included. (2) Another common survey practice is to survey only urban areas, which also results in higher estimates. The source should indicate the survey’s population coverage in its definition of the survey population, but this is not always done.

Issues of representation

An example of how issues of representation affect reported income is demonstrated by the website Average Salary Survey. The survey is biased in two ways – self-selection bias and coverage bias. Self-selection bias occurs when individuals initiate contact with the researcher, while coverage bias relates to potential respondents in the population of interest being excluded – due to the nature of the methodology used. In this case coverage bias exists as the survey is restricted to those with internet access. No attempt is made to correct for these biases in this example, and so the estimate provided is a vast overestimate. According to the Average Salary Survey website, South Africa’s average personal income is R401,310 per annum (3).
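The effect of coverage bias can be illustrated with a toy population. The incomes and internet-access flags below are invented, and the example assumes that internet access is more common among higher earners.

```python
from statistics import mean

# Hypothetical population: (annual personal income in rand, has internet access).
population = [
    (30000, False), (45000, False), (60000, False), (80000, False),
    (90000, True), (120000, False), (250000, True), (400000, True),
    (600000, True), (900000, True),
]

# Mean over everyone versus mean over those an online survey can reach.
true_mean = mean([income for income, _ in population])
online_only_mean = mean([income for income, online in population if online])

print(true_mean, online_only_mean)
```

In this made-up population the online-only mean (448,000) is well above the mean for everyone (257,500), even before any self-selection among those who are online.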

Time lags

Another common source of confusion is the time lag between the date of data collection and the date of publication. In many cases, research reports are dated according to the date of publication, not the date of data collection, and this needs to be kept in mind when comparing different sources of information.

Original census versus updates

While the original census figures may be reported as is, sometimes attempts are made to estimate how the results may have changed over time. Surveys such as the Quarterly Employment Statistics survey or AMPs survey may be used to provide updated estimates of income distributions which in turn affect the estimated median or mean. It should also be kept in mind that survey weights are estimated against target profiles which may themselves be based on other surveys, or based on models of population dynamics. As a result, official sources may provide figures different from the original census data.

So which statistic should you use?

It really depends on what you intend to use the statistic for.

If you want a better understanding of what the ‘typical’ income is, the median provides a better reflection. This is especially the case when comparing countries and regions, where the median makes the most sense as an indicator of the typical income. The mean is too easily affected by differing amounts of inequality. If you find yourself in a situation where the mean is the only statistic available, make sure you also obtain a measure of inequality, such as the Gini coefficient or the Palma ratio. If inequality differs greatly between geographies, their means are not likely to be comparable.
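To see why a mean needs an inequality measure beside it, the sketch below computes a Gini coefficient for two invented regions that share the same mean income. It uses the mean-absolute-difference form of the Gini coefficient on individual incomes; published figures are usually estimated from grouped or weighted survey data, so this is an illustration only.

```python
from statistics import mean, median

def gini(incomes):
    """Gini coefficient: mean absolute difference between all pairs of
    incomes, divided by twice the mean (0 means perfect equality)."""
    n = len(incomes)
    mean_income = sum(incomes) / n
    total_abs_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_abs_diff / (2 * n * n * mean_income)

# Two hypothetical regions with the same mean income.
region_a = [5000, 5000, 5000, 5000]
region_b = [0, 0, 0, 20000]

for region in (region_a, region_b):
    print(mean(region), median(region), gini(region))
```

Both regions have a mean of 5,000, yet the typical household in the second region earns nothing; the Gini coefficient (0 versus 0.75) is what flags the difference.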

If you want to estimate sums, the mean makes more sense. For instance, you may need to know the total income earned in a region and have a population or household count available. Multiplying that count by the mean gives a more accurate estimate of the total than multiplying it by the median.
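A short sketch of this point, using a made-up list of household incomes: the count multiplied by the mean reproduces the total, while the count multiplied by the median can fall far short of it when incomes are skewed.

```python
from statistics import mean, median

# Hypothetical monthly household incomes in rand (illustrative only).
incomes = [2000, 3000, 4000, 5000, 6000, 6000, 7000, 9000, 12000, 150000]

households = len(incomes)
actual_total = sum(incomes)

total_from_mean = households * mean(incomes)      # recovers the true total
total_from_median = households * median(incomes)  # underestimates it here

print(actual_total, total_from_mean, total_from_median)
```

With survey data the mean is itself an estimate, so the total will carry the same sampling error; the example only shows that the median is the wrong multiplier for this purpose.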

Works Cited
1. Sanburn, J. The Toxic Tap. Time Magazine. 1 February, 2016.
2. Statistics South Africa. Key findings: P0277 – Quarterly Employment Statistics (QES). [Online] September 2015. [Cited: 16 February 2016.]
3. Average Salary Survey. Average Salary Survey. [Online] [Cited: 16 February 2016.]

Copyright reserved, Acentric Marketing Research (Pty) LTD & Craig Kolb, 2016