Yugabyte: Decoding the claim of PostgreSQL compatibility

This is a guest post for Computer Weekly Open Source Insider written in full by Karthik Ranganathan, co-founder and CTO, Yugabyte.

Yugabyte is known for its distributed SQL database, built for what the company calls Internet-scale cloud-native applications.

Ranganathan writes as follows…

PostgreSQL has been in active development for over 30 years and is still one of the world’s fastest-growing and most adopted technologies.

Its popularity, cost-effectiveness and dependability (plus a loyal and vocal community) have prompted many database vendors to build modern, cloud-native solutions based on this respected open source technology.

But what does PostgreSQL compatibility mean?

PostgreSQL offers extensive advanced features, a mature enterprise-proven API and a rich ecosystem of tools, extensions and frameworks. A truly PostgreSQL-compatible database should offer similar capabilities and integrate seamlessly with this rich ecosystem, as well as support applications developed for PostgreSQL, by implementing a similar set of features, functions and syntax.

Many databases claim to be PostgreSQL-compatible, but it is important to realize that this does not necessarily mean that the database will always behave like PostgreSQL, or that a PostgreSQL-based application will run smoothly without changes or updates.

Before adopting and running a PostgreSQL-compatible database, you should be clear on what this claim means and how compatible the database really is.

Four levels

I would like to propose four levels of PostgreSQL compatibility. Databases can fall anywhere on the compatibility spectrum, so it’s crucial we understand these levels and how failing to meet them could impact our applications.

Wire-Protocol Compatibility

The lowest level of compatibility is wire-protocol compatibility. PostgreSQL defines its own network-level protocol for exchanging commands, queries and results between the database server and its clients. This compatibility level is required to enable communication between PostgreSQL client drivers and the database. The protocol comes with its own set of messages that a client needs to follow in order to communicate with the database server.

Being wire-protocol compatible doesn’t mean that developers can create applications using PostgreSQL language and tools. It only means that drivers (used by client applications), or tools created for PostgreSQL, will be able to establish a connection with a PostgreSQL-compatible database and exchange commands/queries in the PostgreSQL format.

A wire-protocol-compatible database will be able to accept those commands/queries, but that doesn’t mean it will be able to recognize or execute them. That is only possible if the database also meets the next compatibility levels: syntax and feature.
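To make the wire level more concrete, here is a minimal sketch of two messages a client sends under version 3.0 of the PostgreSQL frontend/backend protocol: the startup message and a simple query. The function names and the sample user and database values are illustrative assumptions, not part of any driver. A real driver also handles authentication, TLS negotiation and the server's responses.

```python
import struct

# Protocol 3.0 is encoded as major version 3 in the high 16 bits, minor 0 in the low 16.
PROTOCOL_VERSION_3_0 = (3 << 16) | 0


def build_startup_message(user: str, database: str) -> bytes:
    """Build a StartupMessage: length, protocol version, then key/value pairs.

    Function name and parameters are illustrative, not a driver API.
    """
    params = b""
    for key, value in (("user", user), ("database", database)):
        # Each key and each value is a null-terminated string.
        params += key.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
    params += b"\x00"  # a final null byte terminates the parameter list
    body = struct.pack("!i", PROTOCOL_VERSION_3_0) + params
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body


def build_simple_query(sql: str) -> bytes:
    """Build a simple Query message: type byte 'Q', length, null-terminated SQL."""
    payload = sql.encode("utf-8") + b"\x00"
    # The length field counts itself and the payload, but not the type byte.
    return b"Q" + struct.pack("!i", len(payload) + 4) + payload


startup = build_startup_message("app_user", "app_db")  # hypothetical values
query = build_simple_query("SELECT 1")
```

A database that understands these bytes is wire-protocol compatible. Whether it can do anything useful with the SQL text inside the Query message is a separate question, answered by the higher levels.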

Syntax Compatibility

Syntax compatibility is the next level and it centres on the parsing and understanding of PostgreSQL syntax by the database.

Upon receiving a valid PostgreSQL command or query, the database should be able to parse and execute it (or send a proper exception if a PostgreSQL feature is not yet supported).

This compatibility level defines to what extent a database supports the PostgreSQL DML and DDL syntax. The higher this compatibility level, the fewer code-level changes an application requires.
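One way to gauge syntax compatibility in practice is to replay a set of representative PostgreSQL statements against the candidate database and record which ones it rejects. The sketch below assumes a caller-supplied `execute` function (for example, a thin wrapper around a driver cursor). The helper name `probe_syntax`, the sample statements and the stand-in executor are all hypothetical and exist only to illustrate the idea.

```python
from typing import Callable, Dict, Iterable

# Illustrative DDL and DML statements; a real probe would use the
# statements your own application issues.
SAMPLE_STATEMENTS = [
    "CREATE TABLE probe_t (id INT PRIMARY KEY, doc JSONB)",
    "INSERT INTO probe_t VALUES (1, '{}') ON CONFLICT (id) DO NOTHING",
    "SELECT generate_series(1, 3)",
]


def probe_syntax(execute: Callable[[str], None],
                 statements: Iterable[str]) -> Dict[str, str]:
    """Run each statement and record 'ok' or the error text it raised."""
    results: Dict[str, str] = {}
    for sql in statements:
        try:
            execute(sql)
            results[sql] = "ok"
        except Exception as exc:  # any driver error counts as a rejection here
            results[sql] = f"failed: {exc}"
    return results


def fake_execute(sql: str) -> None:
    """Stand-in for a real database call: pretends ON CONFLICT is unsupported."""
    if "ON CONFLICT" in sql:
        raise RuntimeError("syntax not supported: ON CONFLICT")


report = probe_syntax(fake_execute, SAMPLE_STATEMENTS)
```

Every statement that comes back as failed is a code-level change the application would need before it could run on that database.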

Feature Compatibility

Feature compatibility is the third level and determines to what extent the database can support core, as well as advanced, PostgreSQL features.

Applications can use different PostgreSQL transaction isolation levels, triggers, views, stored procedures, foreign data wrappers and other extensions. The more features a PostgreSQL-compatible database supports, the fewer code changes you will need to introduce.

Runtime Compatibility

Runtime compatibility is the pinnacle of PostgreSQL compatibility. This level of compatibility is crucial if your new database needs to look and behave just like PostgreSQL.

A runtime-compatible database will match PostgreSQL execution semantics at runtime. It supports queries to the system catalogue, returns the same error messages and error codes, and supports all isolation levels. Runtime compatibility automatically means the three other compatibility levels apply and that developers will be able to continue to benefit from the ecosystem of drivers, libraries and frameworks created for PostgreSQL.
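Error codes are a useful place to check runtime behaviour, because applications and frameworks often branch on them (for example, retrying on a serialization failure). The sketch below compares the SQLSTATE codes a candidate database returned against the codes PostgreSQL uses for the same conditions. The three codes are standard PostgreSQL SQLSTATE values. The scenario labels, the `compare_sqlstates` helper and the sample observations are hypothetical, and how you collect the observed codes depends on your driver.

```python
from typing import Dict, Optional, Tuple

# SQLSTATE codes PostgreSQL returns for three common error conditions.
EXPECTED_SQLSTATES: Dict[str, str] = {
    "duplicate key insert": "23505",    # unique_violation
    "query on missing table": "42P01",  # undefined_table
    "serialization conflict": "40001",  # serialization_failure
}


def compare_sqlstates(
    observed: Dict[str, Optional[str]],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {scenario: (expected, actual)} for every code that differs."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for scenario, expected in EXPECTED_SQLSTATES.items():
        actual = observed.get(scenario)  # None if the scenario was not run
        if actual != expected:
            mismatches[scenario] = (expected, actual)
    return mismatches


# Hypothetical observations from a candidate database that reports a
# generic internal error (XX000) instead of a serialization failure.
observed_codes = {
    "duplicate key insert": "23505",
    "query on missing table": "42P01",
    "serialization conflict": "XX000",
}
mismatches = compare_sqlstates(observed_codes)
```

An empty result would suggest the database matches PostgreSQL for the scenarios tested. Any mismatch points to application retry or error-handling logic that may behave differently after migration.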

Developers familiar with PostgreSQL will feel just as comfortable using a runtime-compatible database.

Closing thoughts

The power and popularity of PostgreSQL are beyond dispute. Many database vendors want to capitalise on PostgreSQL’s reputation, claiming a compatibility that’s tenuous at best. This is why it’s essential that you identify the level of compatibility these databases meet and whether that is sufficient for your use cases.

Ultimately, while it’s crucial that you harness the benefits of the cloud, this shouldn’t mean that you need to spend more time, money and resources amending your applications to run correctly on a PostgreSQL-compatible database that doesn’t quite meet the demands of your business.