The latest installment in the datacentre mysteries - in which Hadoop has long been the prime suspect for writing off the relational database (RDB) - is out now.
Now a mysterious figure from the past has emerged, claiming to be the rightful heir to the original RDB. It looks like there may be a sequel to NoSQL. In fact, more than one...
The RDB is a stubborn old bugger. Built over 30 years ago, when you never had more than 500 users, ever, and they were all prepared to stand in line for their transactions to be processed by a single machine that did each calculation in turn, the SQL RDB ruled in the mainframe age.
It should be extinct by now, given its complete unsuitability for modern commodity datacentres. This poor old dear can't multitask (some have dubbed it the manframe) and it doesn't like sharing house space in the cloud. It's too old for that sort of thing. The RDB liked a bit of old-fashioned number crunching and was a serial transaction processor, but it lacked the analytical skills needed for the modern world. Today you can have 100,000 users all simultaneously accessing their records, and your rackspace can go down as well as up. It was the lack of insight that doomed the RDB, though.
Meanwhile, there were some aggressive youngsters desperate to take over the RDB's prime property in the glass house.
When the RDB is finally bumped off, there will be plenty of suspects. Sybase/HANA have got form, with the most brutal taming of big data in memory.
Then there's the Hadoop crowd, such as Cloudera, with their aggressive mantra of NoSQL, by which they are determined to take over the database market with their promises of real-time transaction processing and big data analytics. If RDB went missing from the datacentre, the first one to be called into Columbo's office for questioning would be Amr Awadallah, co-founder and CTO of Cloudera and an open source advocate. “My wife thinks you're terrific, Mr Awadallah. But just one thing. Didn't you say that you were going to blow RDB out of the water? What did you mean by that?”
As so often happens with these open and shut cases, the most immediate suspect turns out to be a red herring. Yes, Cloudera made a good case for parallel processing, but it soon emerges there was another party with a motive.
In Act Two (after the ad break) we find our detective grappling with a hitherto overlooked candidate who, on closer reflection, had a much stronger motive. Jim Starkey and RDB go back all the way. Starkey created databases for DEC when it was still the Digital Equipment Corporation and data was backed up onto giant tapes. Happy days. But somehow, he and RDB fell out (he wanted it to be more flexible, but the old bugger refused to budge), and so Starkey and his accomplice Barry Morris hatched a fiendish plan. They intended to design a new variation of the database, one that could be replicated onto hundreds of machines, could multitask and was completely fault tolerant.
With fox-like cunning, they devised a plan so that traditional databases could be slowly migrated onto this new model, without anyone noticing. It was the perfect plot, admitted Morris, the CEO of this joint enterprise, NuoDB, when I questioned him in Amsterdam at Gigaom.
“It scales upwards and outwards and overcomes the limited processing options open to traditional databases,” said Morris. It's perfect for the new environment of commodity datacentres and fluid computing resources, which can ebb and flow with the tides of ecommerce sites.
“This new model is fault tolerant, faster and cheaper,” said Morris. “Datacentres could use fewer people to run more systems. It would save a fortune on backups, performance management and storage. And you can migrate the old database onto this.”
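NuoDB's real architecture is a good deal more sophisticated than anything I can fit in a column, but the replicate-everywhere idea behind Morris's pitch — every commodity box holds a copy, so any one of them can fall over without losing a byte — can be sketched in a few lines of toy Python (all class and node names are mine, not NuoDB's):

```python
import random

class Node:
    """One commodity machine holding a full replica of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

class ReplicatedDB:
    """Toy replicated store: every write goes to all live nodes,
    and any live node can serve a read, so losing one machine
    loses nothing."""
    def __init__(self, n_nodes=3):
        self.nodes = [Node(f"node-{i}") for i in range(n_nodes)]

    def write(self, key, value):
        for node in self.nodes:
            if node.alive:
                node.data[key] = value

    def read(self, key):
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise RuntimeError("no replicas left")
        return random.choice(live).data[key]

    def kill(self, name):
        for node in self.nodes:
            if node.name == name:
                node.alive = False

db = ReplicatedDB()
db.write("account:42", 100)
db.kill("node-0")             # one commodity box falls over...
print(db.read("account:42"))  # ...but the data survives: 100
```

The real thing has to worry about concurrent writers, network partitions and replicas rejoining after a crash; the sketch just shows why a dead node is a shrug rather than a funeral.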
And they would have got away with it too, if it wasn't for a young startup, SQLstream, sticking its nose in where it doesn't belong. Damian Black, its CEO, might have accidentally disrupted the perfect disposal of RDB by creating a real-time analytical system that doesn't need to number-crunch big data. Instead, it intercepts live data streams, in real time, and creates insights from them a lot quicker. If they used it to analyse tweets about traffic jams, for example, they could avoid hundreds of millions of snarl-ups on the motorway by tipping people off quicker than those ludicrous outdated 'information' notices they have on all our major trunk roads.
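I can't speak for how SQLstream's engine actually does it, but the general trick — keeping a running answer over a sliding window of recent events, rather than batch-crunching an archive after the fact — can be sketched in toy Python (the traffic-tweet scenario and all names here are my own illustration):

```python
from collections import Counter, deque

class StreamAnalyser:
    """Toy streaming analytics: maintain live counts over a sliding
    window of the most recent events, so the answer is always
    up to date without re-scanning stored data."""
    def __init__(self, window=100):
        self.window = deque(maxlen=window)  # oldest events fall out
        self.counts = Counter()

    def ingest(self, road):
        if len(self.window) == self.window.maxlen:
            self.counts[self.window[0]] -= 1  # expire the oldest event
        self.window.append(road)
        self.counts[road] += 1

    def hotspots(self, n=1):
        return self.counts.most_common(n)

analyser = StreamAnalyser(window=5)
# a live feed of traffic-jam mentions, one road per "tweet"
for road in ["M25", "M1", "M25", "M6", "M25", "M25"]:
    analyser.ingest(road)
print(analyser.hotspots())  # the M25 is snarled up right now
```

The point is that each event updates the answer as it arrives; nobody has to wait for tonight's batch job to find out that this morning's motorway was jammed.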
How does SQLstream do this? I've got no idea – that's all I've got on them. But I suggest you pull Damian Black in for questioning. He seems like a nice man. Mind you, they all do, at first...