Monday, 28 November 2011

Lies, Damn Lies and Government Statistics (Image Intensive)

Government Data Set 1949-2010
Above I have presented a UK Government Data Set which has been widely circulated. The graph plots annual figures, not cumulative data. But I am going to break a major rule of data presentation, and not label my axes (the x-axis is year). What the data set is does not matter, and I do not want to prejudice your opinion*.

Edit: You can now see the data if you wish.

Next we have the official Government predictions. This revision is from May 2010, so it includes data up to 2009 but not 2010.
Government Predictions to 2035 (Orange)
Notice anything odd?

No, in this form neither do I. The data appears to follow a fairly linear trend, so continuing that trend may well be justified. However, there are a few blips: around 1974, the early 1990s and, most recently, since 2007.

It is this recent "blip" which I am going to focus on.

Is it actually a "blip"?

For a rough-and-ready estimate I am going to use what is known as a "Moving Average". This smooths data, enabling long term trends to become more apparent.

The graph below shows a 5-year "Simple Moving Average" (SMA) of the data set.
5-year Simple Moving Average
It is difficult to see in the graph, but last year is the only year on record in which the 5-year SMA has fallen.
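If you want to try this yourself, here is a minimal Python sketch of a simple moving average. It assumes a trailing window (each point averages that year and the four before it); a centred window would be an equally valid choice. The function name and the numbers are made up for illustration and are not the Government data set.

```python
def simple_moving_average(values, window=5):
    """Trailing simple moving average: one output per full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical annual figures, for illustration only.
example = [10, 12, 14, 16, 18, 17, 15]
sma = simple_moving_average(example, window=5)
print(sma)
```

In this toy series the raw figures fall in the last two years, yet the 5-year SMA is still rising. Smoothing lags the data, which is why a fall in the SMA itself is worth noticing.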

A more robust approach is to use the "Simple Moving Median" (SMM). The next graph is a 5-year SMM.
5-year Simple Moving Median
In this graph the last 3 years are the longest period on record for which the 5-year SMM has remained constant (i.e. not increased). The SMM has never fallen over the data set.
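A moving median can be sketched in the same way. Again this assumes a trailing window, and the function name and figures are hypothetical. It uses `statistics.median` from the Python standard library.

```python
from statistics import median


def simple_moving_median(values, window=5):
    """Trailing simple moving median: one output per full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        median(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


# Hypothetical annual figures with one outlier (40), for illustration only.
example = [10, 12, 14, 16, 40, 17, 15]
smm = simple_moving_median(example, window=5)
print(smm)
```

The single outlier barely moves the median, which is what makes it more robust than the average. Note also that the toy output ends with two equal values: a moving median naturally moves in steps, so a flat stretch only becomes interesting when it is unusually long.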

We can apply even more smoothing by using 10-year SMA and SMMs (the next 2 graphs).
10-year Simple Moving Average
10-year Simple Moving Median
The last year is the only year on record that the 10-year SMM has not increased (it remained constant).

However, the 10-year SMA is a pretty straight line. We could almost draw a prediction from it that looks much like the Government's.

There is another way to get an idea of where the data is going without using complex models or looking for non-linear trends: examining the change year-on-year.

I am going to use % change year-on-year (graph below).
% Change, Year-on-Year
Here the trend appears to be an inverse correlation (% change falls as time increases).
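The year-on-year % change is simply each year's figure minus the previous year's, divided by the previous year's and multiplied by 100. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def pct_change_year_on_year(values):
    """Percentage change from each value to the next."""
    return [
        100.0 * (curr - prev) / prev
        for prev, curr in zip(values, values[1:])
    ]


# Hypothetical annual figures, for illustration only.
example = [100, 110, 121, 121]
changes = pct_change_year_on_year(example)
print(changes)
```

The result has one entry fewer than the input, because the first year has no previous year to compare against.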

To speed things up a bit I am going to use a single smoothed graph, with the most smoothed, most robust method, i.e. a 10-year SMM.
% Change, 10-year Simple Moving Median
Now for the important part:

To be able to predict a linear trend in the data we would expect a constant % change. However, we have a % change which tends to fall with time.

Therefore a more sensible prediction is not a linear increase, it is at the very least a trend towards a plateau, if not a peak followed by a decline.

Unfortunately the Government is working with its own prediction, and I predict that this will lead to non-optimal policy.

Although, of course, politicians ignore all statistics and advice that they disagree with.

* I will tell you what the data is later, and provide my sources and calculations should you wish to do some data-hacking yourself.
