One of the major attractions of sport is its sheer unpredictability. On any given day, a team that operates on limited financial resources and has players of modest technical ability can beat one of the best funded and most skilful sports teams in the land. Witness the recent examples of League Two Bradford City toppling Premier League outfits Arsenal and Aston Villa in the Capital One Cup, or Italy triumphing 23-18 over France in last weekend’s Six Nations.
Predicting the outcome of a contest with 100% certainty is incredibly challenging, but is it impossible?
This is the question that IBM is looking to address with the launch of new predictive analytics software, developed in association with the Rugby Football Union (RFU), called TryTracker.
Devising a formula
Over the last few months, the company’s predictive analytics team has been mining data from historical and current rugby matches to see if it can devise a formula that enables it to forecast the outcome of a game.
The analytics work builds on IBM’s success at tennis grand slams with its SlamTracker technology, which maps a match in real-time, highlights key turning points in games and also pinpoints the three things a player must do to increase their chances of winning.
But examining key events in an individual sport like tennis is relatively straightforward when compared with the complexities of a team game like rugby, so the first thing that IBM did when it embarked on this project was gather a large set of historical data from sports data company Opta.
“We began by mining that data to prove to ourselves that there was something useful that we could do in terms of descriptive and predictive analytics,” explains Matin Jouzdani, big data lead (UK) at IBM.
This entailed looking at how the England rugby team usually performs and plays, then drilling down into how the side performs against specific opponents. From that, IBM was able to draw up a series of "keys to the game" for England versus Scotland in the Six Nations.
“The keys to the match are statistically proven to be the things that each team needs to do to have the best chance of winning,” says Jouzdani.
For England, those keys were to win more than 14 turnovers, get more than five line breaks and achieve a successful goal kick percentage of more than 74%. For Scotland, the keys were to achieve a tackle success rate greater than 95%, win more than 85% of their own lineouts and have more than six attempts at goal. England hit two of their keys – achieving 19 turnovers and a kicking percentage of 87.5% – whereas Scotland failed on all of their keys in a game they eventually lost 38-18.
Although Jouzdani was pleased with the accuracy of the forecast, he adds: “We would rather have thousands of fans enjoying the experience of following the keys, than to have a tick in the box saying that they were spot on in terms of accuracy.”
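The threshold check behind England's keys in the Scotland match can be sketched in a few lines of Python. This is only an illustration of the idea, not IBM's implementation: the function and field names are invented for the example, and the only figures used are those quoted in this article. England's line-break count was not reported, so it is left out and marked as unknown.

```python
from typing import Optional

# Thresholds quoted in the article. A key counts as "hit" when the
# match value is strictly greater than the threshold.
# The field names are hypothetical labels chosen for this sketch.
ENGLAND_KEYS = {
    "turnovers_won": 14,
    "line_breaks": 5,
    "goal_kick_success_pct": 74.0,
}


def evaluate_keys(
    thresholds: dict[str, float], match_stats: dict[str, float]
) -> dict[str, Optional[bool]]:
    """Return True/False for each key, or None when no stat was supplied."""
    results: dict[str, Optional[bool]] = {}
    for name, threshold in thresholds.items():
        value = match_stats.get(name)
        results[name] = None if value is None else value > threshold
    return results


def count_hit(results: dict[str, Optional[bool]]) -> int:
    """Count only the keys known to have been hit."""
    return sum(1 for hit in results.values() if hit is True)


# Only the figures reported in the article are supplied here.
england_stats = {"turnovers_won": 19, "goal_kick_success_pct": 87.5}

england_results = evaluate_keys(ENGLAND_KEYS, england_stats)
print(england_results)
print(count_hit(england_results))
```

With the two reported figures, the sketch counts two keys as hit, which matches the result described for England.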
Enhancing supporters’ enjoyment and understanding of the game is also why TryTracker has a number of other built-in analytics features, including a momentum chart that shows how the momentum of a game shifts as it unfolds.
“The chart identifies key moments in the game, be it a line break, a knock or a missed tackle,” explains the RFU’s head of digital Nick Shaw. “We plot those moments on the momentum chart and we’ve also added an editorial overlay into it so either during the game or after, users can dip into the momentum chart and see how important that key moment was.”
Interestingly, given that at times Scotland appeared to be coping quite well with the England attack, IBM’s visualisation showed that at no point were Scotland more dominant than England in terms of momentum.
The TryTracker software also identifies key influencers by analysing every action that each player takes in a game and revealing which three players had the biggest positive impact on their team’s performance. Developing a formula to compare players like for like on the field of play caused a lot of debate, says Shaw.
“Obviously the performance metrics of a prop is very different from a winger so you cannot compare them like for like. What we’ve done is develop a rating system and the criteria is different for each position, based on position-specific historical data. If these individual scores are high in relation to the team average, then the player is going to become one of the key performers.”
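The approach Shaw describes, scoring each player against position-specific criteria and then comparing the scores with the team average, can be sketched roughly as follows. The article does not publish the real criteria, so the weights, action names and players below are all invented purely for illustration, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical position-specific weights. The real rating criteria are
# not given in the article; these numbers exist only to make the sketch run.
POSITION_WEIGHTS = {
    "prop": {"tackles": 1.0, "scrums_won": 2.0},
    "winger": {"metres_run": 0.1, "line_breaks": 3.0},
}


def player_score(position: str, actions: dict[str, int]) -> float:
    """Weight each recorded action using the criteria for the player's position."""
    weights = POSITION_WEIGHTS[position]
    return sum(weights.get(action, 0.0) * count for action, count in actions.items())


def key_influencers(
    players: dict[str, tuple[str, dict[str, int]]], top_n: int = 3
) -> list[str]:
    """Return up to top_n players whose score is furthest above the team average."""
    scores = {
        name: player_score(position, actions)
        for name, (position, actions) in players.items()
    }
    team_average = mean(scores.values())
    margins = {
        name: score - team_average
        for name, score in scores.items()
        if score > team_average
    }
    return sorted(margins, key=margins.get, reverse=True)[:top_n]


# Made-up players and action counts, not real match data.
players = {
    "Prop A": ("prop", {"tackles": 10, "scrums_won": 4}),
    "Prop B": ("prop", {"tackles": 5, "scrums_won": 2}),
    "Winger A": ("winger", {"metres_run": 80, "line_breaks": 3}),
    "Winger B": ("winger", {"metres_run": 20, "line_breaks": 0}),
}

print(key_influencers(players))
```

Because each position is scored on its own criteria before the comparison with the team average, a prop and a winger can both appear among the key performers without being judged on the same actions.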
Game of two halves defies predictive analytics
Despite the success of TryTracker in its first outing, Jouzdani concedes that when it comes to sports, predictive analytics is not an exact science.
Software cannot take into account external factors such as the weather, the fitness of the players and their mental wellbeing. He cites the example of a recent trial that IBM ran with Sky Sports around two big Premier League football matches in January. The first was the London derby between Chelsea and Arsenal, in which each side hit only one key, with Chelsea eventually running out 2-1 winners.
“It was a classic game of two halves and a great example of how the unexpected can control what otherwise would have been a different game,” says Jouzdani. “Chelsea’s opening goal was somewhat unfair in that there was a foul that wasn’t spotted by the referee. They went 2-0 up shortly afterwards when the Arsenal goalkeeper gave away a penalty and you might argue that the mentality of the players wasn’t quite right due to the nature of the first goal.”
One of the keys to the game for Arsenal was keeping a clean sheet in the opening 35 minutes of the match, because IBM had spotted a correlation between the side not conceding in the first 35 minutes and the outcome of the club’s matches. As it turned out, Arsenal were 2-0 down after 16 minutes thanks to the controversial opening goal and a penalty.
The other game that weekend that was scrutinised by IBM’s predictive analytics software was Tottenham Hotspur versus Manchester United, which also threw up interesting results.
“To use a 'technical' term, Spurs were absolutely battering Manchester United, but they were failing to hit their keys whereas United, despite not having their fair share of play were hitting theirs [going into injury time United were 1-0 up],” says Jouzdani.
“In the end, Clint Dempsey popped up in the last minute and ruined that story for me, but it just goes to show that even if you use the best pre-match analysis and predictive engines the outcome of a game can still go another way and that’s one of the wonderful things about sport.”