Blame the Poisson

I recently met Mark Rodbert, CEO of Idax Software, who has an interesting theory on statistics. We often see the ‘Normal’ bell-shaped distribution – where the top of the bell represents the most likely outcomes, and the left and right tips (outliers) are rare events. Rodberts believes real world events are more likely to follow a Poisson distribution – and this has implications for IT. In this guest blog, Rodbert explains the theory:

At idax we spend a lot of time demonstrating that maths really can help describe the real world. As idax uses mathematics to identify individuals with unusual access it’s pretty important that our clients share our understanding.

Of course, people are used to getting on planes, making a phone call or using Amazon, all of which require pretty sophisticated analytics, but in the realms of big data some things are still counter intuitive. If we got two sales leads last week and 1 the week before we’re on an upward trend, if my train was late twice last week, it will be late this week, and most importantly for us, if I find several people with a high risk profile in their access then someone must be someones fault.

London 2012 - Mo Farah

London 2012 – Mo Farah (Photo credit: garda)

But how likely really are these events. Well it turns out that what we need is not someone to blame, but the Poisson distribution. The Poisson is a very versatile statistical tool rather like a lopsided normal distribution, that is good for estimating event frequency, especially if the events are rare. And my all time Poisson concerns the distribution of gold medals for Team GB at the London Olympics. It seems strange to remember that at the start of the games we went a whole three days without a British gold medal. As the press shrieked that we were heading for disaster, unable to meet our targets despite massive investment, the nation held its breath. So what really were Mo Farah’s chances?

Well, as we all now know, actually pretty good. Of course only an idiot would assume that winning 29 medals over 16 days should equate to 2 every day with Sundays off, but how likely was a medal-less day. Well if you assume a Poisson distribution and take an average of 1.8 a day, the chance of a day with no medals is 16%. The chances of a super Saturday with 6 medals were actually 7%.


The bad news is that, as you can see from the chart above the Poisson doesn’t quite fit what actually happened. The good news is that a day without any golds was actually more likely at 38% of all days. The least likely (below 5) was a single gold day, which only happened once. The last day of the boxing, since you ask. So why does any of this matter? Because it shows that human beings are very bad at estimating how frequently things are likely to happen. We assume that events are evenly distributed and get confused when they’re not. Not much of a problem with gold medals; quite a big problem when you’re tying to detect fraud, rogue trading and high levels of access risk. We assume that because unusual failures are, well, unusual they are also uniformly infrequent.

So when it comes to Access and Identity Management its clear that an approach that defines cumulative controls by exception management, otherwise known as “my boss checks my access” – will perform well with the frequent but not so bad but does nothing to stop the infrequent but high risk. So the good news is that if you ask your staff why they have access to something you’ll probably remove a few copies of Visio, but you’re unlikely to spot the guy with access to the general ledger and the payments system who’s ripping the company off. Which just goes to show that what companies need is real analytical capability, and of course a bit of mathematics.

Mark Rodbert is CEO of Idax Software, the identity analytics software provider