A couple of days before the UK referendum on whether to #Remain in or #Leave the EU, we looked at the data and tried to predict the outcome of the vote. In short, we tried to beat the Dart Throwing Chimp. This fella:


As data-driven humans, we collected data and analysed it. Before that, we decided on a model and made some assumptions. The assumptions were based on previous experience with other referendums and votes, namely:
1. The undecided majority makes its decision in the last 48 hours before the actual vote, mostly on an emotional basis
2. The opinions in social media, pro and contra, would correlate with the actual vote results
3. We would look into several other factors besides the opinions expressed in social media

The data indicated that, 48 hours before the referendum, the #Remain and #Leave camps were pretty much equal. Let’s let the data speak:

Opinions in social

One can see the equality, but also the slight momentum the Remain camp was gaining in the final straight. We could only speculate, but it was fair to assume that a possible reason was the barbaric act that occurred on June 16th: Jo Cox, the Labour MP, Remain proponent, mother, a pretty woman in her early 40s, was slaughtered in the street by a madman from the Leave camp.

We knew that only a month earlier, in May, the Leave side had a 10-point advantage over Remain. But this murder had the potential to swing the advantage towards the Remain side, as it had exactly the prerequisites to influence the undecided majority through emotion.

We also looked to the bookies for adjusting factors: they reported that the bets aligned perfectly with those placed on the Scottish referendum to remain in the UK 18 months earlier. Based on that, we made our prediction: 55% to Remain, 45% to Leave.

And we were wrong!

It turned out to be 48% to Remain, 52% to Leave.
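To put a number on how far off we were, here is a minimal sketch in Python. It uses only the rounded percentages quoted above; the names `predicted`, `actual` and `margin` are ours, made up for illustration, and are not part of any tool we used.

```python
# Rounded vote shares as quoted in this post (percent).
predicted = {"Remain": 55, "Leave": 45}
actual = {"Remain": 48, "Leave": 52}


def margin(shares):
    """Remain's lead over Leave in percentage points (negative = Leave ahead)."""
    return shares["Remain"] - shares["Leave"]


# Error on each side's share: prediction minus result.
share_error = {side: predicted[side] - actual[side] for side in predicted}

# Error on the winning margin: we expected Remain +10, the result was Leave +4.
margin_error = margin(predicted) - margin(actual)

print(share_error)   # {'Remain': 7, 'Leave': -7}
print(margin_error)  # 14
```

A 7-point miss on each side's share translates into a 14-point miss on the margin, which is why the prediction did not just overshoot but picked the wrong winner.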

But then, with the hindsight of the demographics that decided the vote, we looked at the data again. And this time we saw it…

Online metrics

The vote turned out to be young vs. old, and older citizens neither participate in social media nor get influenced by it. They read newspapers and follow online news outlets. And on that channel, it turns out, the Leave camp was crushingly over-represented. Over the last 6 months, the Leave messages generated over 30% more reach across traditional online channels, an astonishing 100+ billion impressions. Even on the day when Labour MP Jo Cox was assassinated, the Remain coverage couldn’t get the upper hand over Leave.

Online metrics zoom in

Leave is shown in light red, while the blue of Remain looks purple on the chart:

For anyone trying to forecast outcomes based on external data, #Brexit was one of the biggest lessons. Nowadays everybody is talking about social media listening. But every now and then we get strong proof that all data matters!