History lesson points way forward on data management projects

Data management projects often suffer from historical amnesia. Projects that co-locate business and IT staff and keep to a manageable scale can achieve remarkable results, says consultant Andy Hayler.

Edmund Burke, the 18th century Irish politician, author and philosopher, is often credited with saying that those who don't know history are doomed to repeat it. I have run or been involved in numerous large-scale data management projects, and recently I have observed a number of them whose managers would do well to look back at some lessons from projects done in the past.

Many moons ago, I was involved with a “project office” function at Shell and spent some time immersing myself in project management and the theory, such as it was, of how to estimate the time it would take to complete projects. I was intrigued to find that much of the theory was quite old, having been based on detailed observations of projects carried out in the 1970s and 1980s. The techniques highlighted in the sometimes impenetrable tomes describing the theory were mostly overlooked in the “modern” project management courses that I had attended at university. Indeed, some of the estimating theory seemed quite counterintuitive, and I was rather sceptical about how accurate it would be on projects using up-to-date technologies.

In particular, there was one somewhat obscure formula that related the number of work months a project required to the elapsed time needed to complete it. The formula predicted an optimum elapsed time for a project of a given size; if you tried to compress that time artificially, the effort needed rose steeply, out of all proportion to the time saved.

I was rather dubious about this until I was able to see two essentially identical Shell data management projects being carried out in different operating units, both rolling out the same software and both of a similar scale. One had a reasonably relaxed 13-month time frame that fit the formula well; the other was operating on a slightly compressed time frame of 11 months, imposed to fit it in with another project. The formula predicted that the second project would take not just a little more effort, but almost double the number of work months.

To my surprise, that was exactly what happened: The second project quickly fell behind schedule, and to make up time, more contractors were added, then more, then more. By the end, the project costs had doubled. As more resources were added to the project, productive workers were taken off tasks to train the new people, and more meetings were needed to keep the larger team informed. This slowed things down, so still more resources were added to catch up, creating a vicious circle. It was a textbook example of how ignoring old project management history can turn out to be costly.
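The arithmetic behind the "almost double" prediction can be sketched in a few lines. The article does not name its formula, so this is an illustration under a stated assumption: that effort varies with the inverse fourth power of elapsed time, as in Putnam-style estimating models from that era. The function name and the exponent are hypothetical choices for the example, not the formula Shell used.

```python
# Illustrative sketch only. ASSUMPTION: effort scales with the inverse
# fourth power of elapsed time (a Putnam-style trade-off). The article
# does not state which formula was used, so the exponent is a guess.

def effort_multiplier(planned_months: float,
                      compressed_months: float,
                      exponent: float = 4.0) -> float:
    """Return how many times more effort the compressed schedule needs."""
    if planned_months <= 0 or compressed_months <= 0:
        raise ValueError("schedules must be positive")
    return (planned_months / compressed_months) ** exponent


relaxed = 13.0      # months, the schedule that fit the formula
compressed = 11.0   # months, the schedule imposed on the second project

multiplier = effort_multiplier(relaxed, compressed)
schedule_cut = 1 - compressed / relaxed

print(f"Schedule cut: {schedule_cut:.0%}")
print(f"Effort multiplier: {multiplier:.2f}x")
```

Under that assumption, trimming about 15% from the schedule multiplies the effort by roughly 1.95, which is consistent with the near doubling described above.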

Small projects good, large projects less good
In general, the estimating theory predicts that large projects are far less productive than smaller ones. When there is a team of a half-dozen people all in the same room, communication is easy: you just lean across the desk and chat with a colleague about some issue of software design. Once you need a larger team -- and especially if the team is no longer co-located but split into different rooms or locations -- productivity plummets as communication becomes more complex and maintaining a coherent sense of the project goals gets tougher.
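One simple way to see why communication gets harder is to count the possible one-to-one conversations in a team. The sketch below uses the pairwise-channel count n(n-1)/2, a rule of thumb commonly associated with Fred Brooks's "The Mythical Man-Month"; the article does not cite it, so treat it as an assumption brought in for illustration. The function name and team sizes are hypothetical.

```python
# Illustrative sketch only. ASSUMPTION: communication overhead is
# approximated by the number of distinct pairs of team members,
# n * (n - 1) / 2. This is a rule of thumb, not the article's formula.

def communication_paths(team_size: int) -> int:
    """Return the number of distinct pairs in a team of the given size."""
    if team_size < 0:
        raise ValueError("team size cannot be negative")
    return team_size * (team_size - 1) // 2


for size in (6, 20, 150):
    print(f"{size:>3} people -> {communication_paths(size):>6} possible pairs")
```

A half-dozen people have 15 possible pairs to keep in sync; a team of 150 has 11,175, which is why a large team leans so heavily on meetings and documents rather than a chat across the desk.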

The key lesson is that giant projects often struggle. Where possible, you are better off splitting large projects into smaller component parts and delivering them incrementally, so that each smaller project team is likely to remain reasonably productive. I have observed some extremely large (over $50 million) master data management (MDM) projects successfully follow this “think big, start small” mantra, breaking up a global rollout into numerous smaller-scale projects that deliver MDM capabilities for one geography or one data domain at a time. That takes more elapsed time, but it requires less total work and leaves time to correct early errors.

By contrast, I saw the opposite approach recently at a large US company. It had decided to fix its master data problems in one giant gulp, simultaneously tackling all major data domains and geographies, all to be driven centrally by a large team. The project team grew to more than 150 people, and the result was a fiasco. The business requirements turned out to be more complex than initially realised, and it was tough to maintain project momentum despite throwing yet more consultants at the project. After 18 months, it had manifestly failed to deliver anything of value. Eventually, to save face, a vastly cut-down deliverable was “made live,” but it was an application that remained essentially unused.

Not every project can be neatly broken up into smaller chunks, and sometimes there are external constraints on deadlines, such as government legislation. Yet time and again, I have conversations with people working on large projects that are struggling and find that they are guilty of easily avoidable sins. A common one is having a sketchy or non-existent business case.

And in many cases, a data management project is led by a centralised team following a waterfall project methodology, not because the project is suited to that style but because the methodology was blindly imposed by the systems integrator that won the implementation contract. In addition, the project team is split between the client office and a group in an offshore location “to save money,” yet the time difference breeds confusion between the two groups and causes endless rework as the chain between the business user, the business analyst interpreting requirements and the software developer grows longer and more convoluted.

We can do better. Data management projects frequently need many iterations, because end users are hazy about their requirements until they see a prototype and then realise they wanted something slightly different. Projects that co-locate business and IT staff, keep to a manageable scale and deliver in small incremental chunks can often achieve remarkable results. None of this should be surprising, since the theory behind it is buried away in a series of dusty and hardly read books from the 1980s. Yet all around the world, projects that would be better suited to incremental delivery are kicking off right now under a rigid waterfall methodology, with artificially tight deadlines set to satisfy some perceived urgency. Perhaps a little reading of some history is in order.

Andy Hayler is co-founder and CEO of analyst firm The Information Difference and a regular keynote speaker at international conferences on MDM, data governance and data quality. He is also a respected restaurant critic and author (see www.andyhayler.com).