Information isn't power, data flexibility is power

Information is power; we already know this. Knowledge is power as well; we know that too. But we must now realise that information is data… and data flexibility is the only way to ensure we retain the power we so desire.

So then, data flexibility is power.

There, we said it. I feel better – don’t you?

But how can we prove this assertion? Take Pearson PLC, the self-proclaimed “world’s leading learning company”, no less. OK, it does own the Financial Times Group and Penguin Books. The firm launched its Plug & Play API platform last year with the intention of making its content available for third-party developer innovation.

The project set about exposing some Pearson content through APIs so that internal and external developers could play with it, mashing Pearson data with other APIs to create new learning products.

Pearson PLC head of technologies Diana Stepner says that one consistent theme her department heard from developers is that they need flexibility in how they access the data.


“Developers don’t care what database houses the data – they want to be able to easily explore and mould the data into their desired end product,” said Stepner.

She describes it as the difference between swimming in a lane of an indoor pool versus being able to dive into an open ocean.

“The worst thing for a developer’s creativity is limitations on the original data. Taking this feedback on board we’ve recently explored and implemented an alternative to our original relational, Postgres database,” she added.

After weighing up various options, Pearson settled on MongoDB, as the team found open source “very appealing”, with more opportunity to evolve and make changes. MongoDB’s schema-less nature let Pearson evolve its data structures more easily than it could with a relational database.

NOTE: MongoDB (from “humongous”) is a scalable, high-performance, open source NoSQL database written in C++ that stores structured data as JSON-like documents with dynamic schemas. JSON (JavaScript Object Notation) is a text-based, human-readable, language-independent data interchange format for representing simple data structures and objects; it originated in JavaScript and is widely used in web applications.
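To make “dynamic schemas” a little more concrete, here is a minimal sketch in plain Python. It does not talk to MongoDB at all: ordinary dicts stand in for documents and a list stands in for a collection. The records, field names and the `find` helper are all invented for illustration and are not Pearson’s data or MongoDB’s API.

```python
import json

# Hypothetical records, invented for illustration (not Pearson's data).
# Each dict stands in for one document; the list stands in for a collection.
collection = [
    {"title": "Intro to Algebra", "type": "textbook", "pages": 320},
    {
        "title": "Market Briefing",
        "type": "article",
        "tags": ["finance"],
        "author": {"name": "A. Writer"},  # nested structure, only on this document
    },
]

# One document gains a new field; the others are left untouched and no
# table-wide schema change is needed.
collection[0]["edition"] = 2


def find(docs, **criteria):
    """Hypothetical helper: return the documents whose fields equal all criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


articles = find(collection, type="article")
print(json.dumps(articles, indent=2))
```

The point of the sketch is only that documents of different shapes can sit side by side, which is the kind of freedom to “mould the data” that Stepner describes. A real document database adds indexing, querying and persistence on top of this idea.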

Pearson’s Stepner writes as follows:

“With MongoDB, one of the basic benefits to us was that the data is stored in a binary form of JSON. Previously data was stored in XML, so transformations were required before they could be served to web developers who tend to prefer JSON. These transformations led to more complex JSON structures than were strictly necessary. Restructuring our database for MongoDB and adopting a JSON-first approach has been a valuable alteration that has greatly improved the developer experience.”

“Our MongoDB migration has been very positive. That’s not to suggest that NoSQL databases will be the future for everyone – many traditional or regulated systems, such as those found in banking and finance, require the referential integrity built into relational databases. But for businesses that sit on a wealth of unstructured data, NoSQL databases can provide a fascinating alternative capable of stimulating innovation and new uses for data – expanding the imagination and future business potential.”
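Stepner’s remark about XML transformations producing “more complex JSON structures than were strictly necessary” can be illustrated with a rough sketch. The XML record, the conversion rules and the JSON-first document below are all assumptions made up for this example; Pearson’s actual formats and pipeline are not described in the source. The sketch uses only the Python standard library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML record, invented for illustration.
XML_RECORD = """
<book id="42">
  <title lang="en">Intro to Algebra</title>
  <authors>
    <author>A. Writer</author>
    <author>B. Scribe</author>
  </authors>
</book>
""".strip()


def element_to_dict(elem):
    """Mechanically mirror an XML element, keeping attributes, text and children."""
    node = {}
    if elem.attrib:
        node["@attributes"] = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    children = [{child.tag: element_to_dict(child)} for child in elem]
    if children:
        node["children"] = children
    return node


def depth(value):
    """Count how many levels of dicts and lists are nested inside a value."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


# JSON produced by a generic, structure-preserving conversion from XML.
root = ET.fromstring(XML_RECORD)
converted = {root.tag: element_to_dict(root)}

# The same information designed as JSON from the start.
json_first = {
    "id": 42,
    "title": "Intro to Algebra",
    "language": "en",
    "authors": ["A. Writer", "B. Scribe"],
}

print(json.dumps(converted, indent=2))
print(json.dumps(json_first, indent=2))
print("nesting depth:", depth(converted), "versus", depth(json_first))
```

In the converted form, a developer has to dig through wrapper keys such as `children` and `#text` to reach the first author, whereas the JSON-first document exposes it as `json_first["authors"][0]`. Other conversion conventions exist and some are tidier than this one, so treat the sketch as one plausible example of the overhead rather than a description of what Pearson’s transformations actually produced.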