DataStax LangChain integration builds new link to gen-AI for developers

The open source Apache Cassandra database gave rise to DataStax, the company known for commercially supported services built around the database itself.

Now describing itself as the company that powers generative AI applications with real-time scalable data, DataStax keeps its roots in Apache Cassandra, but works within a wider realm of related technologies, tools and computing methodologies.

This month sees the firm announce a new integration with LangChain, a popular orchestration framework for developing applications with Large Language Models (LLMs).

The integration is designed to make it easy to add DataStax’s Astra DB – a real-time database for developers building production generative AI (gen-AI) applications – or Apache Cassandra, as a new vector source in the LangChain framework.

Retrieval Augmented Generation

Retrieval Augmented Generation (RAG) is the process of providing context from outside data sources to deliver more accurate LLM query responses. With developers increasingly adopting RAG in their generative AI applications, DataStax says that they require a vector store that gives them real-time updates with zero latency on critical, real-life production workloads.

Generative AI applications built with RAG stacks require a vector-enabled database and an orchestration framework like LangChain, to provide memory or context to LLMs for accurate and relevant answers. Developers use LangChain as the leading AI-first toolkit to connect their applications to different data sources.
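To make the RAG flow described above concrete, here is a minimal, self-contained Python sketch. It is not the LangChain or Astra DB API: the names ToyVectorStore, embed, cosine, build_prompt and PROMPT_TEMPLATE are hypothetical, the "embedding" is a simple bag-of-words count standing in for a real embedding model, and the store is an in-memory list standing in for a vector database.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: a real app calls an embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.rows = []  # list of (embedding, original text)

    def add(self, text):
        self.rows.append((embed(text), text))

    def similarity_search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


PROMPT_TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {question}"


def build_prompt(store, question, k=1):
    """Retrieve context from the store and inject it into the prompt template."""
    context = "\n".join(store.similarity_search(question, k=k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


store = ToyVectorStore()
store.add("Astra DB is a database service from DataStax")
store.add("LangChain is an orchestration framework for LLM applications")

# The prompt that would be sent to an LLM, with retrieved context filled in.
prompt = build_prompt(store, "What is LangChain")
print(prompt)
```

In a production stack, the orchestration framework would handle the retrieval and templating steps and the database would handle storage and similarity ranking; the sketch only shows how the pieces relate.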

The special sauce

The special sauce here is that this new integration lets developers use the Astra DB vector database for their LLM, AI assistant and real-time generative AI projects through the LangChain plugin architecture for vector stores.

“In a RAG application, the model receives supplementary data or context from various sources — most often a database that can store vectors,” said Harrison Chase, CEO, LangChain. “Building a generative AI app requires a robust, powerful database, and we ensure our users have access to the best options on the market via our simple plugin architecture. With integrations like DataStax’s LangChain connector, incorporating Astra DB or Apache Cassandra as a vector store becomes a seamless and intuitive process.”

Together, Astra DB and LangChain help developers to take advantage of framework features like vector similarity search, semantic caching, term-based search, LLM-response caching and data injection from Astra DB (or Cassandra) into prompt templates.
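Of the features listed above, semantic caching is perhaps the least self-explanatory: instead of reusing an LLM response only when the prompt text matches exactly, the cache reuses it when a new prompt is close enough in meaning to an earlier one. The Python sketch below illustrates the idea under loud assumptions. SemanticCache, fake_llm and cached_llm are hypothetical names, not LangChain or Astra DB calls, the similarity measure is a toy bag-of-words cosine, and the 0.8 threshold is arbitrary.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache that matches prompts by similarity, not exact text."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (prompt embedding, cached response)

    def lookup(self, prompt):
        """Return a cached response if a similar enough prompt was seen, else None."""
        q = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None

    def store(self, prompt, response):
        self.entries.append((embed(prompt), response))


calls = []  # records every prompt that actually reaches the (fake) model


def fake_llm(prompt):
    """Stand-in for an expensive LLM call."""
    calls.append(prompt)
    return f"answer to: {prompt}"


def cached_llm(cache, prompt):
    """Check the semantic cache first; call the model only on a miss."""
    hit = cache.lookup(prompt)
    if hit is not None:
        return hit
    response = fake_llm(prompt)
    cache.store(prompt, response)
    return response


cache = SemanticCache(threshold=0.8)
first = cached_llm(cache, "what is a vector database")          # miss: calls the model
second = cached_llm(cache, "what is a vector database please")  # similar: served from cache
third = cached_llm(cache, "how do I bake bread")                # unrelated: calls the model
print(len(calls))  # number of real model calls
```

Here the second prompt is served from the cache because it is nearly identical in wording to the first, so only two of the three requests reach the model. A real deployment would use model-generated embeddings and store the cache entries in the vector database rather than in a Python list.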

“Developers at startups and enterprises alike are using LangChain to build generative AI apps, so a deep native integration is a must-have,” said Ed Anuff, CPO, DataStax. “The ability for developers to easily use Astra DB as their vector database of choice, directly from LangChain, streamlines the process of building the personalised AI applications that companies need.”

Anuff concludes by saying that his team is already seeing users benefit from these joint technologies: healthcare AI company Skypoint is using Astra DB and LangChain to power its generative AI healthcare model.