Big Data – Want, Don’t Want, Don’t Care
A couple of days ago the BBC World Service ran a feature on Big Data. This, apparently, is the kind of information we have been collecting since the dawn of computing. Stuff about us, willingly or unknowingly donated as we go about our daily lives. Stuff about our world and the cosmos beyond that flows into data centres from satellites, weather observatories and sensors on the ground and in the sea. And stuff collected by humans before the age of computers, meticulously digitised and integrated into models that help us predict the future. Or not.
It was the kind of radio where you could predict the age and appearance of the contributors without the benefit of video. Apart from the US Army general who’s “excited” by the use of big data to improve the treatment of brain injury in soldiers and NFL footballers, the experts were Jobsian corporate types, whose excitement leapt out from the airwaves. The kind of enthusiasts who are too young to remember that “cool” was invented in the 1940s.
According to a chap from Virgin Media, there is the data equivalent of 83 billion high definition movies stored in computers around the world. The same guy waxed lyrical about big data helping him to eliminate mundane tasks and freeing up his time to focus on important stuff – presumably like watching high definition movies or flogging them to other people.
This led me to think about how this thing called big data affects my life – or doesn’t. On a day-to-day basis, most of us don’t think about data, unless we’re in the information business. We take it as a fact of life if we blunder into inner city London and get sent a bill because we haven’t paid the congestion charge. We’re annoyed if our phone bill is incorrect, and when we contact the call centre in Bangalore, we expect the charming operator to have our customer information in front of her in seconds. We sneer if the weather forecast is wrong, not thinking of the zillions of calculations that enable us to find out if it’s going to rain when we play golf this afternoon, or how things are looking at the holiday resort we’re heading for tomorrow morning.
Some of us remember life without much obvious big data. Early in my professional life I operated one of those monstrous IBM mainframes that took 24 hours to process a payroll run that could be done today on a laptop in five minutes. For many people growing up today, the fruits of big data are no more startling than switching on the kettle for a morning cup of tea.
So I got to asking myself what kind of data makes a positive impact on my life, what I can do without, and what leaves me indifferent. One of the mundane tasks beloved of men is apparently creating lists. As a member of the anally retentive fraternity that enjoys counting from one to ten instead of doing more important stuff, I humbly offer five types of data that I want, five that I don’t want, and five that I don’t care about. Obviously it’s a very personal view, and I doubt that anyone reading this would choose each of the items below. For which reason, I welcome your thoughts and comments.
Google: my life would be diminished without Google. Actually any old search engine will do provided it can deliver what Google does. Where else would I go to find out the plot in a movie I haven’t time to see through to the end? If I want to do a bit of personal media monitoring and see how many pics there are of me on Page 1 of the image search on my name? (The answer, by the way, is five on the first line of images – my own little personality cult.) If I want to find some massaged statistics or loony views on a subject?
Wikipedia: despised by academics who discourage its use as a reference point in indigestible PhD theses, preyed upon by PR companies who want to burnish their shady clients’ backgrounds, Wikipedia is still a marvellous resource. Where else can I find – in seconds – stuff about poisonous snakes, obscure 1930s politicians, reptilian conspiracies and combatants in the Hundred Years’ War? Yes, it’s pretty hit and miss, and you need to exercise your critical faculties on what you read, but there has been nothing like Wikipedia in the history of humanity. I thank Jimmy Wales and his friends for that.
Census Data: censuses have been around since the time of the early Pharaohs. In the 30s, thanks to the good offices of IBM, the Nazis industrialised the process and used Big Blue’s punch cards to record the throughput of the Holocaust. More recently, records from successive British censuses have been digitised. I’ve always been one for family history – I managed to find out how many servants a couple of my ancestors had in 1901. Before I croak, I fully intend to do the genealogy bit. How nice to discover that I’m descended from a bastard of Charles II, a cutpurse from Dartford, a merchant from Istanbul or just a boring sheep farmer from Flanders. Genealogy uses data to bring history alive, and we can’t have enough history.
Genomes: I’ve dropped enough hints over the past few years that I’d like a DNA profile done. Even better would be a full genome. Not that I want to know what horrible disease is likely to claim me in the course of time – my lifestyle almost guarantees that a self-inflicted condition will see me out before any genetic predisposition. But I’d dearly love to know how much Neanderthal lurks in my genes, or whether I’m one of the millions whose ancestors sprang from the loins of Genghis Khan. So far, my family have not come through. Maybe this Christmas.
Surveys: I love surveys, and I use them often in my business. Web-based tools like SurveyMonkey are easy to use and cheap to buy. The best thing of all is that you can produce any result you want. That’s not true of course. But when you have some bone-headed person who can only be convinced of a point by being presented with some quantitative evidence, these days you don’t need to hire Gallup for millions of dollars to make your point.
Economic Data: Am I alone in zoning out when politicians start spouting economic statistics? You know they’re being selective. They know you know, yet still they do it. Gordon Brown, the former British Prime Minister, was a walking, talking teleprinter. As Barack Obama once said, you can put lipstick on a pig, but it’s still a pig. It got so ridiculous in the recent US elections that there were teams hired by both sides specifically to dissect the statistical blather uttered by the candidates, and expose the bull.
Illegible charts: I’m one of those people who struggle to understand complex graphics. There must be a neurological condition that describes my problem – graphical recognition disorder perhaps. Nothing annoys me more than sitting through a presentation full of pie charts and process diagrams which the speaker assumes must be intelligible because some bright acolyte has spent a few days creating them in PowerPoint. And they’re so bloody detailed that you need the Hubble Telescope to make any sense of them. The worst offenders are business consultants trying to sell you something, and TV journalists trying to explain the inexplicable, especially when they’re talking about finance.
Rocket scientists: How the big financial institutions allowed a bunch of unhinged mathematicians to screw up their businesses is beyond me. Complex financial and trading models that nobody apart from their creators – decision makers included – can understand are one of the major reasons why we got into the mess we landed in back in 2008. When a concept is too complex for an executive, a politician or an EU auditor to grasp, then God help the rest of us, because most of them are just as dumb as we are. Rocket scientists are the high priests of wilful obscurity, and data is their sacrament.
Decision support systems: I don’t trust systems that suck data from multiple sources, digest it and spew it out in the form of a few simple numbers on a sexy software dashboard. If you think that’s all you need to run a business from the commanding heights of an enterprise, then you’re no smarter than politicians who sit in ivory towers and make decisions that affect the lives of people about whom they have zero understanding. Data never, ever, ever, tells the whole story about anything.
Labour saving apps: the whole idea that we can use data to free up our time for more “valuable” activities is one of the myths of our time. Just as the mania for kitchen devices has created demand for products that we buy but never use – teasmades and vegetable peelers in our parents’ generation, and now bread makers, talking fridges and cappuccino machines – the modern software equivalents fail to take into account that if we spend all our time doing “valuable things”, we’ll keel over with stress. I actually like doing mundane things that allow my brain to idle. I have no problem with peeling potatoes, weeding the garden and checking the use-by date of food in my fridge. Doing mundane stuff puts me into a kind of dream state. It serves as a contrast to all the brain-draining stuff the modern world considers valuable. Routine is good. Boring is good. Making lists is good.
Facebook: if people want to post pictures of goofy dogs ten times a day, tell us about their drunken holidays or bombard us with obscure quotations and motivational slogans, that’s fine by me. If they end up regretting the consequences when their youthful (or senile) indiscretions go viral, that’s also fine by me. Any Facebook user who makes naïve assumptions about the privacy of the personal information they entrust to Mr Zuckerberg has only themselves to blame when their silliness, vanity or narcissism comes back to bite them in the bum. I have no problem with Facebook or with people who use it, for I am one of them. But for every benefit it delivers I can see a downside. Personally, I don’t care if it lives or dies. The same goes for Twitter, by the way.
Personal data: when I say I don’t care about the fact that information about me is sitting on a thousand databases, many of which are increasingly talking to each other, it’s largely because my life is so unremarkable that I can’t conceive of anyone finding data about me to be remotely interesting. I might feel differently if some government of the future marks me down for euthanasia once I reach a certain age. The problem is that we signed our pact with the devil when we started using computers, and with the arrival of the internet, the pact has become a tender embrace. I do object to having my identity stolen and my credit cards cloned, and I don’t like the idea that someone’s listening to my phone calls and storing my emails. But I’m afraid that these are realities we will find it difficult to roll back. So the sensible approach is to be aware and take evasive measures. And don’t forget to vote out any bastards that want to take things too far. Thank goodness we can still do that in many countries.
Personalised marketing: I’m not fine with spam, which is not in any way personalised except in as much as some criminal has got hold of my email address and wants to sell me Viagra or scam me out of money. But by and large I am OK with companies like Amazon trying to sell me things on the basis of my previous purchases. The point is that with most of these guys you can opt out and unsubscribe. If you can’t, it’s spam. And yes, I do get a lot of email from organisations that are getting ever smarter at hitting my personal spots. But I really don’t care that much, and spending an hour or two every month clearing them out is one of those therapeutically mundane tasks I was talking about earlier.
Climate data: I can see a few of my beloved readers bristling when I say I don’t care about climate-related data. Wait. This is not the same as saying I don’t care about the consequences of climate change. I surely do. But I’m fed up with swivel-eyed followers of the true faith bombarding us with stuff that turns out a month later to be either false or just half the picture. So while I’m always interested in the detailed science, I only pay attention to the balance of probability – informed, of course, by data and spiced by a measure of common sense. Maybe that has something to do with the likelihood that I won’t be around to experience the worst case. That doesn’t stop me from supporting mitigating initiatives. But I do so on the basis of logic rather than faith.
Health data: if I paid attention to the theories related to the causes of high cholesterol, I would have spent the past few decades oscillating between states of fear and relief. As with climate change research, medical data is a moveable feast. I read about it, study it and move on. I keep swigging back aspartames and cooking my omelettes with butter as Atkins, Dukan and all the other diets rise and fall. I can’t stand Flora margarine, and did a little jig when a study claimed that it was just as detrimental to health as butter. One day, no doubt, my arteries will remind me of my folly. I pay more attention to my own experience and perception than to the barrage of advice I encounter from media of all kinds. Remember the joke about the accountant who, when asked by his boss what 2+2 equals, answers “what would you like it to equal?” With health data, if we look hard enough, we can always find a conclusion that suits our purpose. So why bother? Just leave it to fate and the balance of probability.
I’m sure I could come up with a dozen more examples along these lines. But I’m far too busy attending to valuable activities like washing the dishes, reorganising the directory on my laptop and pondering the future of mankind. By the way, has anyone got around to defining what small data might be? Lists would probably be a good example.
Your thoughts welcome.