Corporate IT’s new vocation will be data integration. Mark Madsen, president of consulting firm Third Nature, will tell delegates at the London TDWI Business Intelligence (BI) Symposium next week, in his keynote speech, that the business of big data will change the function of the IT department to be less about technology and more about information architecture.
Speaking ahead of the symposium, Madsen said he will take a few swipes at big data zealots who think their activities are “unprecedented”. Precedent needs to be put back in, he argued. “It is a pity they don’t teach the history of science in science programmes,” he said.
If the history of previous information explosions were better understood, he said, IT and information professionals could abstract the problem better: “We tend to substitute technology for architecture, the concrete for the logical and abstract”.
Madsen also said the business intelligence supplier community has ended up “architecting for its convenience [rather] than for what people need. BI still only gets 17% to 24% penetration in a company. That is a sad commentary, and it is because [the vendors] are looking at it through the same lens as at the birth of the data warehouse, 25 years ago”.
He added: “Go back 25 years. We had five applications in the enterprise and four were financial. We had limited data and resources. And those resources, in terms of time and people, were insanely expensive. We had million dollar systems with 500GB of data. The world the data warehouse was born into was a totally different world”.
We now have big data technologies, such as Hadoop and NoSQL databases, that are based largely on work that emerged from Google, Yahoo and the social media companies.
On the business intelligence side of the data warehouse/BI dyad, he said the newer data visualisation vendors, such as Tableau Software and Tibco Spotfire, while building on OLAP architectures from the mid-1990s, are at least tackling “the use case of ‘I don’t quite know what I am looking for, but need to hunt for it’”.
Old school BI was always too dictatorial, he said. “You had to have a bounded question that fitted into the rectangular borders of a report. The traditional BI vendors have been locked into a SQL call and response to a database. But they are waking up, and will win back market share to the extent that the data is in a data warehouse, as long as they get the HCI [human computer interaction] part right”.
As business intelligence moves away from its core of finance, it has to respond to different needs in work groups on the edge of the organisation, he said. Mobile BI, in particular, necessitates different interaction models: “It’s not just about smaller screens, but about what you do with information when you are in the middle of a process looking at things – on the factory floor – and not sitting looking at a screen on a desk”. In a mobile context, you need a custom application, not so much to query data as to get alerts: “That’s all new”.
Madsen takes the view that Oracle, IBM and Microsoft are in an architectural dead end when it comes to databases: “That model of building a universal database is gone, for the time being at least. Parallel, shared nothing, domain specific databases solving domain specific problems is the conventional wisdom”.
As for the big data realm, there is, he said, a quandary. “Do we have SQL semantics moving over into MapReduce land, or do we have two separate, distinct environments? It is not clear yet what will happen. You have Greenplum, with its SQL on Hadoop version. You have Teradata, with Aster, with MapReduce wrapped under SQL. And then there is the big data crowd, who think they will take over the world”.
Madsen sees the future of corporate IT, in this new data environment, as more architectural. “The data warehouse does what it does well and is not going to go anywhere. But it is not architected very well for the future”. There is an analogy, he said, with the shift from Web 1.0 to Web 2.0. There, allowing web browsers to do RPCs [remote procedure calls] eliminated the application-server model, making the web less broadcast and more interactive.
“A lot of the big data stuff, seen as data infrastructure, is, similarly, about reconfiguring [components of] the architecture, putting the data warehouse in place as one piece of a larger picture.
“The role of IT changes with cloud infrastructure, software as a service, applications moving outside, and so on. And so we are not managing servers, or databases, or applications (other than some security administration). Our job, as IT, revolves entirely around one thing -- data integration”.