“What’s in a name? That which we call a rose
By any other name would smell as sweet.”
- Romeo and Juliet, Act 2, scene 2 (William Shakespeare)
Statistics is dead, long live data science! Or are the rumours wildly exaggerated?
What is in a name? Or specifically the name for the ‘study of the collection, analysis, interpretation, presentation, and organization of data’, as Wikipedia puts it – aka ‘statistics’.
Sir William Petty, who was doing statistical things in the seventeenth century, called his work ‘Political Arithmetick’. About a hundred years later, Gottfried Achenwall gave us ‘Statistik’. And in 1791, the Scottish politician Sir John Sinclair was the first to use ‘statistics’ in the English language.
Since then statistics has got a bad name – not helped by the confusion as to what was meant by the phrase ‘lies, damned lies and statistics’, and the perceived inability or unwillingness of politicians to use statistics properly.
Does anyone have an authoritative source for whether or not politicians are the great transgressors that they are thought to be?
In the last couple of months, I’ve seen ‘misuse’ – through error, misunderstanding or misrepresentation – of data, statistics or a statistical term by a learned society, an award-winning media outlet, and three times by a fact-checking organisation. Even the Government’s Chief Scientific Adviser has not been immune to one or two minor slips.
Oh, and I’ve messed up, too, on this web site and elsewhere!
We all make mistakes. Is it because politicians use statistics so much that, from time to time, they are bound to get something horribly wrong, and these instances are then used to support the general perception?
Isn’t there a little bit of the prosecutor’s fallacy here? (Hmm… note to self, that’ll make a good blog posting one day.)
Anyway, back to names… some do indeed declare that ‘statistics’ is dead and it’s all about ‘data science’. Certainly, the latter term trips off the tongue better – statistics just trips.
The Bard, through Juliet, makes the point that whatever the name of a thing it is still that thing – even if that thing is an intangible area of study and application.
So should we accept that ‘statistics’ has had its day?
If the discipline were just developing now, what would we call it? Apparently, Shakespeare invented at least 1700 words that we commonly use. If he were alive now and not a playwright, he would make a great brand developer. Would he have coined ‘statistics’? We cannot know.
In his Presidential Address to the Royal Statistical Society in December 2008, Professor David Hand said that: “Statistics is both the science of uncertainty and the technology of extracting information from data.”
Professor Valerie Isham, in her Presidential Address to the Royal Statistical Society, referred to a then-future RSS president and now National Statistician, John Pullinger. Pullinger called for a fourth R – ‘statistical Reasoning’ – to be added to the existing three (Reading, ‘Riting and ‘Rithmetic).
So statistics is a science and a technology, and requires a type of reasoning.
Does ‘data science’ capture all that? Would ‘political arithmetick’, with or without the ‘k’?
My feeling is no – for the time being I think ‘statistics’ is still alive and well. Though it might benefit from a makeover!