This is a guest post for Computer Weekly Open Source Insider written by Ben Bromhead in his capacity as chief technology officer at Instaclustr — a company that provides a managed service platform of open source technologies such as Apache Cassandra, Apache Spark, Elasticsearch and Apache Kafka.
Bromhead notes that, after two years of anticipation, the beta release of Apache Cassandra 4.0 is now available. As a major supporter of the open source Apache Cassandra project, Instaclustr is tremendously excited to help bring this beta release to the community. So what do we need to know?
Bromhead provides us with an overview of some of the most important features and capabilities the new version of Cassandra offers and writes as follows:
One of the bigger highlights of the Cassandra 4.0 beta is the addition of new enterprise-grade auditing capabilities. Cassandra operators can now track and log database user activities. This includes the ability to leverage both configurable audit actions and full query logging to audit all reads, writes, login attempts, schema changes and more.
Functionally, audit logging in Cassandra 4.0 enables enterprises to fulfill their obligations under ever-more-stringent regulatory compliance frameworks – and perhaps especially enterprises under the purview of PCI DSS and SOX – using a powerful high-level interface.
It also provides developers and operators with the ability to easily capture and replay specific workloads, improving their ability to investigate performance and data model issues. These twin auditing capabilities empower enterprises to closely observe all database user activity, and ensure more secure, compliant and performant operations.
Arguably the biggest highlight, though, is around stability.
The development of Cassandra 4.0 proceeded with the stated goal of delivering “the most stable major release to date” to achieve a high adoption rate during this release cycle. In service of this effort, the Cassandra 4.0 beta development utilised several new testing frameworks over the course of its creation, all designed to drive improvements to both Cassandra’s stability and its performance.
These efforts have achieved their goals. Our team’s confidence in a major release of an open source data-layer technology (and we’ve worked with many in our time) has never been higher.
Cassandra 4.0 includes increased adoption of the Netty Transport Framework throughout the codebase.
This will better facilitate communication between nodes by integrating Netty’s asynchronous event-driven networking code.
In prior Cassandra releases, there was a requirement to maintain N threads for each peer, which necessitated a great deal of context switching… much to the detriment of performance. In contrast, Netty allows Cassandra 4.0 to feature a single thread pool for all connections between nodes. At the same time, it enables SSTables to leverage zero-copy streaming, achieving 5x faster streaming.
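To make the contrast with the thread-per-peer model concrete, here is a minimal illustrative sketch. It is not Cassandra or Netty code: Python's asyncio event loop stands in for Netty's event-driven model, the function names are hypothetical, and the network round trip is simulated with a short sleep. Netty typically uses a small pool of event-loop threads, whereas this sketch uses just one.

```python
import asyncio
import threading


async def exchange_with_peer(peer_id):
    # Simulated non-blocking round trip to one peer. While this coroutine
    # waits, the event loop is free to serve other peers on the same thread.
    await asyncio.sleep(0.01)
    return peer_id, threading.current_thread().name


async def exchange_with_all(peer_count):
    # Start every exchange concurrently on the one event loop.
    return await asyncio.gather(
        *(exchange_with_peer(peer_id) for peer_id in range(peer_count))
    )


def run_demo(peer_count=100):
    # Returns (number of replies, number of distinct threads that handled them).
    replies = asyncio.run(exchange_with_all(peer_count))
    threads_used = {thread_name for _, thread_name in replies}
    return len(replies), len(threads_used)


if __name__ == "__main__":
    reply_count, thread_count = run_demo()
    print(f"{reply_count} peers served by {thread_count} thread(s)")
```

Because the peers share an event loop rather than each owning dedicated threads, the number of threads no longer grows with the number of peers, which is the source of the reduced context switching described above.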
Rebuilding Cassandra’s networking infrastructure has delivered major performance improvements in several other metrics as well. In initial testing, Cassandra 4.0 reduces P99 tail-end read latency by more than 40%. It also allows large clusters to scale faster and more easily, and substantially reduces node recovery times.
Until now, Cassandra users have needed JMX access in order to check on key details – including running compactions, metrics, clients, and certain configuration settings. Cassandra 4.0 alleviates some of the challenges of this mechanism by offering virtual tables, making it possible to query this data using CQL from read-only system tables.
While JMX access is here to stay, virtual tables provide an improved method of achieving metric monitoring and other tasks without the added configuration burden.
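As a rough sketch of what querying this data over CQL might look like from application code, the snippet below builds plain SELECT statements against virtual tables. The table names are assumptions based on the system_views keyspace as I understand it in Cassandra 4.0 and should be checked against your own cluster. The helper names are hypothetical, and `session` is assumed to be any object exposing an `execute(query)` method, such as a session from the DataStax Python driver.

```python
# Assumed virtual table names in Cassandra 4.0's system_views keyspace.
# Verify these against your cluster before relying on them.
VIRTUAL_TABLES = {
    "clients": "system_views.clients",
    "settings": "system_views.settings",
    "compactions": "system_views.sstable_tasks",
    "thread_pools": "system_views.thread_pools",
}


def build_query(topic):
    # Virtual tables are read with ordinary CQL SELECT statements.
    return f"SELECT * FROM {VIRTUAL_TABLES[topic]}"


def fetch(session, topic):
    # `session` only needs an execute(query) method that returns rows.
    return list(session.execute(build_query(topic)))


if __name__ == "__main__":
    for topic in VIRTUAL_TABLES:
        print(build_query(topic))
```

Each virtual table reflects the state of the node that answers the query, so a monitoring tool would typically run these queries against every node it cares about.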
Last, but never least, community
No list of Apache Cassandra features is complete without mentioning the vast and dedicated community that has made Cassandra 4.0 what it is, and continues to advance its capabilities. Cassandra’s community is its greatest asset, and with the Cassandra 4.0 beta, this community has demonstrated the validity of the open source model as practiced by the Apache Software Foundation.