Why Graph Needs an SQL

This is a guest blog post by Neo4j co-founder and CEO Emil Eifrem, who explains why his sector needs its own ‘Graph Query Language’.

In the same way that SQL helped facilitate the growth of RDBMS and its subsequent domination of the database market over the last two decades, graph software needs to create a standard property graph query language to secure its next stage of enterprise acceptance.

Why? Because, put at its simplest, a common query language could open the doors to graph technology being used more widely, in exactly the same way the lingua franca of SQL propelled relational to the summits 30 years ago.

The main reasons are detailed in the GQL Manifesto, an open letter that we at Neo4j are publishing for the entire database world, and I’d like to summarise them here. Basically, two groups need to come together to achieve a common graph industry language: industry, such as Neo4j with Cypher and Oracle with PGQL, and academia. Property graph researchers in academia have moved on from query languages like XPath for querying documents or SPARQL for querying RDF, and are now interested in the wider property graph context.

A next-generation graph query language

Over the last 18 months there’s been a lot of behind-the-scenes activity on unification that led to the Manifesto, and there’s a lot more commonality we can build on. Oracle has achieved some great innovations with PGQL, and PGQL is very close to Cypher, with the two languages coming from the same fundamental design approach. That means the primary designers of Cypher and PGQL are already collaborating with each other (for example, on limited graph querying extensions to SQL) and with researchers on what a good, next-generation graph querying language needs to look like.

So what we need to do is take all that intellectual ferment and energy and turn it into actuality. There are two key ways to do this. The first is making it even easier to express complex patterns very concisely in a graph database, using regular path queries and regular expressions, which can be used to define the pattern of things you want to look at. The second is composable graph querying, where you glue queries together so that one query takes the output of another as its input.
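To make those two ideas a little more concrete, here is a minimal sketch in Python rather than in any real graph query language. The toy graph, the names in it and the helper functions `filter_label` and `reachable` are all hypothetical and are not taken from Cypher, PGQL or the Manifesto. The first helper returns a graph so that its output can feed another query; the second follows one or more edges from a starting node, which is the kind of "one or more hops" pattern a regular path query lets you write in a single expression.

```python
from collections import deque

# Hypothetical toy graph: (source, relationship label, target) triples.
EDGES = [
    ("ann", "KNOWS", "bob"),
    ("bob", "KNOWS", "cat"),
    ("cat", "KNOWS", "ann"),
    ("cat", "WORKS_AT", "acme"),
    ("ann", "WORKS_AT", "initech"),
]


def filter_label(edges, label):
    """Graph in, graph out: keep only the edges carrying the given label."""
    return [edge for edge in edges if edge[1] == label]


def reachable(edges, start):
    """Return the nodes reachable from `start` over one or more edges."""
    adjacency = {}
    for src, _label, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    seen = set()
    queue = deque(adjacency.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited, so cycles terminate
        seen.add(node)
        queue.extend(adjacency.get(node, []))
    return seen


# Composition: the output of the first query is the input of the second.
knows_graph = filter_label(EDGES, "KNOWS")
friends_of_friends = reachable(knows_graph, "ann")
```

Because `filter_label` hands back a graph rather than a flat list of results, any other graph query can be applied to what it returns. That is the property composable graph querying is after.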

Put that together and we have a GQL at least as good as SQL’s first implementation – and we need to find a way to make that happen. Many graph database practitioners have, I believe, come to the conclusion that these different languages should come together and be the inputs to a new unified language for the whole market.

Hence the Manifesto campaign: a joint, open, collaborative initiative to which we want everyone to contribute, including developers, managers and line-of-business users looking for the kind of innovation and creative data modelling you can only get from this form of database technology.

If you agree with me and other graph database advocates that the time is right to create one standard property graph query language, then we would love to receive your vote. Find out more at the GQL Manifesto main website [https://gql.today].

We hope you agree, and that we can add your voice to the discussion.

The author is co-founder and CEO of Neo4j (http://neo4j.com/).
