Neo Technology CEO: What is a graph database... and why big data needs one

CEO of Neo Technology Emil Eifrem guest blogs below for the Computer Weekly Developer Network to explain what, really, we mean by the notion of the graph database.

TechTarget’s own definition states that a graph database (also called a graph-oriented database) is a type of NoSQL database that uses graph theory to store, map and query relationships — but is it worth hearing a definition laid down by industry too?

[Image: Emil Eifrem, Neo Technology]

A brief history of (database) time

Back in the mainframe era there were a huge number of different types of databases – all with different ways of organising the data on disk. But by the time we got into the 1980s, one model became dominant.

Enter SQL. With data organised into tables (think Excel), it was initially very successful, mapping well onto the majority of business applications available at the time.

They might be (data) giants

This was a time when the tech giants ruled. With Oracle, IBM and Microsoft in the game you could choose your vendor but the method of structuring the data was not up for debate…

… it was the SQL way, or the highway.

So why are we moving away from this trusted model?

Put simply, data no longer fits a one-size-fits-all strategy. With the arrival of big data we are no longer talking in MB or GB, and we’re certainly not talking about structured information.

Businesses are collecting vast streams of data about anything and everything, often without much thought on how it will be managed, analysed or even stored. Trying to push these huge, irregularly-shaped data sets into the traditional SQL model is painful.

Not Only SQL

Hence we have the ‘Not Only SQL’ movement (also known as NoSQL).

Within NoSQL we have real choice over how data is structured, each model offering various strengths and weaknesses.

Ed: exactly. As recently explained on Forbes: “NoSQL is argued to be shaping our future because, as a database type, it depends on data structures that can (for certain use cases) operate faster than traditional relational databases. The NoSQL data structure taxonomy is defined by key-value stores, documents or graph databases. In other words, the database design can be structured around what can be a more custom-aligned DNA for the use case in hand.”

Eifrem continues…

Graph databases are part of this movement. Focusing on the relationships between data-points, rather than on the values themselves, graphs are perfect for those big, messy and connected data sets. This is something that SQL databases simply can’t do – at least without spending significant effort creating complicated join tables.
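To make the contrast concrete, here is a minimal sketch in plain Python. It is not Neo4j's API or storage format: the sample data and all function names are hypothetical. It only illustrates the difference between finding a relationship by scanning a join table and following a relationship stored directly on the node. Real relational engines use indexes rather than a naive scan, so treat this as a picture of the data model, not a benchmark.

```python
from collections import defaultdict

# Hypothetical sample data: each pair means "source follows target".
FOLLOWS = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("alice", "erin"),
]


def neighbours_via_join_table(rows, person):
    """Table-style lookup: scan every row of the join table for a match."""
    return sorted(target for source, target in rows if source == person)


def build_adjacency(rows):
    """Graph-style storage: each node keeps its own set of relationships."""
    adjacency = defaultdict(set)
    for source, target in rows:
        adjacency[source].add(target)
    return adjacency


def neighbours_via_graph(adjacency, person):
    """Graph-style lookup: read the relationships held by the node itself."""
    return sorted(adjacency.get(person, ()))


if __name__ == "__main__":
    graph = build_adjacency(FOLLOWS)
    print(neighbours_via_join_table(FOLLOWS, "alice"))
    print(neighbours_via_graph(graph, "alice"))
```

Both lookups return the same answer. The difference is where the relationship lives: in a separate table that has to be searched, or on the node, where it can be followed directly.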

With the graph you can ask complex and abstract questions that look beyond the first data connection.

Graphs can uncover patterns that are difficult to detect using traditional representations such as tables. It may be a social graph; it may be going from point A to point B; or it may be product recommendations, where you want to know what else was bought by the people who bought similar things to you.
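The product recommendation case can be sketched as a short traversal: customer to product, product to other customers, and from those customers to products the first customer does not yet own. The Python below is an illustrative assumption, not how Neo4j or any particular retailer does it. The purchase data, the `recommend` function and its overlap-based scoring are all hypothetical.

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of products bought.
PURCHASES = {
    "you": {"tent", "stove"},
    "ann": {"tent", "stove", "lantern"},
    "ben": {"tent", "sleeping bag", "lantern"},
    "chloe": {"novel"},
}


def recommend(purchases, customer, limit=3):
    """Suggest products bought by customers who share purchases with `customer`.

    Each candidate product is scored by the number of shared purchases
    between `customer` and whoever bought it (a simple, assumed heuristic).
    """
    mine = purchases.get(customer, set())
    scores = Counter()
    for other, theirs in purchases.items():
        if other == customer:
            continue
        overlap = len(mine & theirs)  # products both customers bought
        if overlap == 0:
            continue  # no connection between the two customers
        for product in theirs - mine:  # products only the other customer has
            scores[product] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:limit]]


if __name__ == "__main__":
    print(recommend(PURCHASES, "you"))
```

With this sample data the function returns the lantern first and the sleeping bag second, and never suggests the novel, because that customer has no purchase in common with "you".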

Importantly, understanding the connections between data, and the meaning of these links, doesn’t need new data. You can pull new insights from existing data, simply by reframing the problem and looking at it as a graph.

CWDN notes: about the author’s firm

Neo Technology is the creator of the Neo4j graph database, which brings data relationships to the fore. The firm recently announced Neo4j 2.2, with major updates intended to derive maximum value from data relationships. Enhancements in Neo4j 2.2 include a new Cypher cost-based optimizer and the addition of a new in-memory page cache to improve application read performance and scalability.


Image courtesy of Neo Technology.