
How Kognitio mined the reality gap on London bus timetables

The work was inspired by TfL’s open data policy and could spark ideas in other channel partners, Nick Booth argues

One of the world’s biggest regional authorities, Transport for London (TfL), is running an open data policy that could be an unmissable business opportunity for channel partners. One of the early movers, big data specialist Kognitio, has exemplified how the data sets made available to developers can be put to creative use, with a report on London’s Worst Bus Stops.

Kognitio went to TfL’s open data area and downloaded 4,948,534,706 data points. After digesting information from 19,687 individual bus stops, 675 bus routes and 9,641 buses, it identified London’s least reliable bus stop sign: Ringway in Zone 4. If you are in this part of West London and getting a bus towards Southall, you really don’t want to believe anything you’re told, because only 3.6% of the buses at this stop arrive on time, measured against the timetabled schedule.
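For readers who want to see what a figure like that 3.6% means in practice, here is a minimal sketch of an on-time percentage per stop. It is not Kognitio's method: the record layout, the function names and the 60-second tolerance are all assumptions made for illustration, since the article does not say how "on time" was defined, and the sample records are made up.

```python
from collections import defaultdict

# Hypothetical tolerance: an arrival counts as "on time" if it lands within
# this many seconds of the timetabled time. The threshold Kognitio actually
# used is not stated in the article.
ON_TIME_TOLERANCE_S = 60


def on_time_share_by_stop(arrivals, tolerance_s=ON_TIME_TOLERANCE_S):
    """Return {stop_name: percentage of arrivals within the tolerance}.

    `arrivals` is an iterable of (stop_name, scheduled_s, actual_s) tuples,
    with both times given as seconds since midnight (an assumed layout).
    """
    totals = defaultdict(int)
    on_time = defaultdict(int)
    for stop, scheduled_s, actual_s in arrivals:
        totals[stop] += 1
        if abs(actual_s - scheduled_s) <= tolerance_s:
            on_time[stop] += 1
    return {stop: 100.0 * on_time[stop] / totals[stop] for stop in totals}


def least_reliable_stop(shares):
    """Return the (stop, percentage) pair with the lowest on-time share."""
    return min(shares.items(), key=lambda item: item[1])


# Made-up sample records, for illustration only.
sample = [
    ("Stop A", 28800, 28830),  # 30 seconds late: counts as on time
    ("Stop A", 29400, 29700),  # 300 seconds late: does not count
    ("Stop B", 28800, 28810),
    ("Stop B", 29400, 29420),
]
shares = on_time_share_by_stop(sample)
worst = least_reliable_stop(shares)
```

Changing the tolerance changes the ranking, which is one reason two parties can look at the same buses and disagree about how reliable they are.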

Kognitio has identified the best and the worst places to be waiting for a bus in London, judged on timetable accuracy. Elsewhere, other studies have separately identified the streets in which you are most likely to be mugged. I wonder if these two reports can be cross-referenced.

The report also reveals the reliability of each bus route, and Kognitio provides a tool that checks the reliability of buses in each postcode against that of the rest of London.
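A comparison of that kind can be sketched in a few lines. Again, this is only an illustration under stated assumptions, not a description of Kognitio's tool: the function name, the (postcode, on_time) record layout and the placeholder postcode labels are all hypothetical.

```python
def compare_postcode_to_rest(records, postcode):
    """Return (postcode_pct, rest_pct): on-time percentages for one postcode
    and for every other postcode combined.

    `records` is an iterable of (postcode, on_time) pairs, where `on_time`
    is a bool. This layout is an assumption made for the sketch.
    """
    in_total = in_ok = out_total = out_ok = 0
    for code, on_time in records:
        if code == postcode:
            in_total += 1
            in_ok += int(on_time)
        else:
            out_total += 1
            out_ok += int(on_time)
    if in_total == 0 or out_total == 0:
        raise ValueError("need at least one record inside and outside the postcode")
    return 100.0 * in_ok / in_total, 100.0 * out_ok / out_total


# Made-up records with placeholder postcode labels, for illustration only.
records = [
    ("X1", True), ("X1", False),
    ("Y1", True), ("Y1", True),
    ("Z1", False), ("Z1", True),
]
local_pct, rest_pct = compare_postcode_to_rest(records, "X1")
```

Comparing against "everywhere else" rather than the London-wide average keeps the chosen postcode from diluting its own benchmark, though either choice would be defensible.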

Kognitio pulled the data from the TfL API to test how fast it could run queries with Tableau as an interface to data held on Hadoop. The output is fascinating because it contrasts with how TfL measures reliability, says CEO Roger Gaskell: “They use predicted times on boards to say whether a bus arrived on time, not by the expected number of buses in a given period.”

Many bus routes don’t fulfil their promise, according to data analyst Chak Leung, who ran the study. “They’re not actually as punctual as I initially thought,” says Leung.

Still, TfL’s bold initiative is inspiring people to help improve its service. It was launched by TfL’s chief data officer (CDO) Lauren Sager Weinstein, who was recently named by Data IQ as the seventh most important person in the information industry.

“We’re championing communities of developers to take our feeds and use our open interfaces and create benefits for everyone,” says Sager Weinstein.

Sager Weinstein was involved in the introduction of the Oyster and Contactless card payment systems, which had a noticeable effect on bus journey times because they eliminated the coin-fumbling delays at every single bus stop on every journey.

TfL’s decision to provide free, accurate and open real-time data has boosted London's economy by £130m a year, according to a study by Deloitte. That figure is an estimate of how much of our collective time has been saved by quicker journeys. It also takes in the new inventions and commercial opportunities created by access to information about everything from air quality to traffic numbers to journey times.

TfL’s infrastructure is an immense network that includes trains, trams, boats, buses, roads, light railways, walkways and even a chairlift-cum-airline. Integrating all these is a massive undertaking, given that much of the infrastructure is centuries old and most of the forms of transport were discrete systems with their own methods for recording information.

We can all think of improvements that need to be made. I’d like to develop an automatic cattle prod that gives dawdlers a 5,000-volt reminder that they need to move down the train to let other people on.

There’s plenty of scope for more creative uses of data, surely.

Meanwhile, TfL’s Sager Weinstein is busy trying to galvanise another sleeping giant into action. Huge sections of the British public never seem to consider a career in IT. Women don’t seem to be very enthusiastic about IT as a career, and girls don’t seem to want to study it at school. Students from disadvantaged social groups also seem to lack the confidence to aspire to an IT career. This is a tragic waste of talent for the UK IT industry, which is handicapped by understaffing. It’s also a sad waste of opportunity for the students. They need jobs, and we need more people to join the industry. This is something Sager Weinstein is working to rectify, touring schools and seeking to inspire the students.

By visiting schools across London and showing Year Five students how data is relevant to their journeys, Sager Weinstein aims to attract a more diverse stream of talent and perspectives into the industry. Maybe this new bus stop study will engage the kids and get more of them involved.
