La Liga, Spain’s premier football competition, is using technology from Databricks to analyse player performance and offer personalised experiences to fans.
The data scientists at LaLiga Tech, the technology arm of the sports association, are working with the Databricks Lakehouse data architecture.
The term “data lakehouse” is a portmanteau of data warehouse and data lake. The architecture is said to combine the data management and performance typically found in data warehouses with the low-cost, flexible object stores offered by data lakes.
A Databricks statement said the architecture provides the “foundation for its [La Liga’s] data strategy and cuts away at complexity”.
LaLiga is the way the division is styled under the sponsorship of Santander.
In 2021, the association launched LaLiga Tech as a company, bringing together La Liga’s technology systems and selling them externally, beyond football, to other sports, as well as other media and entertainment industries.
Guillermo Roldán, head of the architectural department at LaLiga Tech, said: “We are creating a world where data informs almost every aspect of how sports are played and experienced. The Databricks Lakehouse Platform has been transformative not just in our ability to analyse game-play, but it has also been part of the foundation of the whole LaLiga Tech business, which is democratising data for the whole industry.
“Our ecosystem of digital services is not just changing football, but other sports around the world, as well as leading media companies. I think we are just at the tipping point of what can be achieved with data and AI.”
Databricks said the lakehouse allows LaLiga Tech to access data in a single data lake and run artificial intelligence (AI), machine learning (ML) and business intelligence (BI) workloads on a single platform. This means the tech team can access data, build models and produce statistical valuations at a click, rather than having to spend time extracting and downloading data from a range of databases.
Rafael Zambrano, head of data science at LaLiga Tech, said: “Working with Databricks has completely changed our ability to access data through one unified environment, enabling ML at scale unlike ever before.”
The data team at LaLiga Tech is using data and AI across three broad areas. The first is match statistics and in-play analysis, based on data from cameras in each club’s stadium. The data team uses two main types of data: one is “eventing” data, which covers passes, shots on goal and tackles; the other is data extracted from tracking the ball.
This information is available through LaLiga Tech’s Mediacoach tool. It allows data scientists at the clubs to perform pre- and post-match analysis and to predict player injuries.
As well as delivering this information to clubs, LaLiga Tech’s data scientists are working on a second area: investigating new ways to visualise the information for broadcast audiences. One example is a “goal probability” metric shown in televised matches.
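The article does not say how the “goal probability” metric is calculated. As a rough illustration of what a metric of this kind can look like, the Python sketch below scores a shot from its position on the pitch using a simple logistic function of distance and the angle of goal visible to the shooter. The function names and the coefficients are hypothetical, chosen only for illustration, and this is not LaLiga Tech’s model; a real system would fit its parameters to historical event and tracking data.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
# A real model would estimate these from historical shot data.
INTERCEPT = -1.0
DISTANCE_COEF = -0.1  # effect per metre from the centre of the goal
ANGLE_COEF = 1.5      # effect per radian of visible goal mouth

GOAL_WIDTH_M = 7.32   # standard width of a football goal in metres


def goal_angle(x: float, y: float) -> float:
    """Angle in radians subtended by the goal mouth from the shot position.

    x is the distance from the goal line and y is the sideways offset
    from the centre of the goal, both in metres.
    """
    to_near_post = math.atan2(y + GOAL_WIDTH_M / 2, x)
    to_far_post = math.atan2(y - GOAL_WIDTH_M / 2, x)
    return to_near_post - to_far_post


def goal_probability(x: float, y: float) -> float:
    """Illustrative probability that a shot from (x, y) results in a goal."""
    distance = math.hypot(x, y)
    score = (
        INTERCEPT
        + DISTANCE_COEF * distance
        + ANGLE_COEF * goal_angle(x, y)
    )
    # The logistic function maps any score to a value between 0 and 1.
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # Compare a central shot from the penalty spot with one from 30 metres.
    print(f"Penalty spot: {goal_probability(11.0, 0.0):.3f}")
    print(f"Long range:   {goal_probability(30.0, 0.0):.3f}")
```

With these illustrative coefficients, a central shot from the penalty spot scores higher than one from 30 metres, which is the behaviour a broadcast graphic of this type would be expected to show.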
The third area lies in synchronising unstructured, semi-structured and structured data at scale, running ML models on anything from gigabytes to terabytes of data, in one environment, globally, and in real time.
The English Premier League has made similar use of Oracle Cloud for in-match statistics for fans.
The application of data analytics in football, and in sport more broadly, is a popular and widespread endeavour. In 2014, SAP claimed its analytics had helped Germany win the Fifa World Cup. Individual club sides have used analytics for player recruitment and valuation, injury prevention and fan engagement, as well as performance.