Released: a RAGOps template you can grow and refine in production

A "growing RAG" template — Human in the loop, role-based data access control, Amazon Bedrock integration, and more. RAGOps automates field-driven knowledge management.

Bringing "field-ready" generative AI within reach

Across the enterprise, LLM-centered generative AI usage is advancing at impressive speed.

The challenge for DX teams today is to plan and build LLM-based services at a level of usefulness the field can actually adopt. That requires customization beyond prompt engineering — tuning to specific operational contexts. RAG (Retrieval-Augmented Generation), which flexibly pulls in business data to tune the LLM's output, has come into focus and adoption is accelerating in the enterprise.

Customers driving RAG adoption commonly tell us that "RAG accuracy is poor (low answer quality)" and that "the path to real business use feels long." In response, we are releasing a template that realizes "RAGOps" built on a "growing RAG" concept.

With the RAGOps template, you can spin up RAG quickly, capture data from real business use, and lift accuracy effectively over the course of operations.

RAG (Retrieval-Augmented Generation) retrieves related information, such as business data, from a database at query time and folds it into the LLM's generation process. This produces accurate responses that match the operational context: using your in-house knowledge base, you can build a smart conversational AI specialized to the work.
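The retrieve-then-generate pattern can be sketched in a few lines. This is a minimal illustration, not the template's actual implementation: the word-overlap ranking stands in for a real vector search, and `retrieve` and `build_prompt` are hypothetical names.

```python
# Minimal sketch of the RAG pattern: retrieve related business data,
# then fold it into the prompt sent to the LLM.
# Word overlap is a stand-in for real vector similarity search.

def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Inject the retrieved context into the generation prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt produced by `build_prompt` is what gets sent to the LLM; because the context comes from your own knowledge base, the answer reflects your operational data rather than only the model's training data.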

Making RAG truly useful requires many capabilities and operational practices: data collection, data cleaning, data augmentation, user feedback collection, knowledge-base updates, security and privacy, and more. Producing outcomes from RAG means combining these elements appropriately and operating continuously to fit the work. We call this effective operation of RAG "RAGOps" (RAG Operations).

About the RAGOps template

The RAGOps template is a RAG application template on exaBase Studio with the mechanics to "grow and refine" in production. By including features that improve RAG answer quality during operation, it dramatically lowers the bar to going live.

You can try the template by simply copying and pasting on exaBase Studio, and each part inside it can be added or swapped to fit your operations and the services you use. This is enabled by exaBase Studio: stack quick wins from something you can already touch, then expand incrementally as outcomes accrue.

Architecture of the RAGOps template

The architecture has three primary processing parts:

  1. Communication Agent — the interface between users and each system. Returns answers, requests answer registration from operators, and registers high-rated answers in databases for reuse.
  2. Dataset Management Service — creates and manages datasets per usage context, and stores them in the appropriate database. This enables data access control configuration and improves RAG retrieval quality.
  3. Data Loader — fetches data from databases.

The Communication Agent has a unique twist: it acts like a concierge, leveraging past questions and answers to deliver efficient, high-quality responses.

  • RAGOps stores past user questions and answers in a cache DB based on user ratings.
  • For similar questions, it answers from the cache DB rather than calling the LLM-driven RAG path — improving cost efficiency and answer quality.
  • It extracts relevant business data, retrieves past operator answers, builds a prompt that yields effective results, and gets an answer from the LLM.
  • If the user is not satisfied, it routes to an operator.
  • The operator registers the appropriate answer in an answer DB; from then on, similar questions are answered from the answer DB.
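The flow in the bullets above can be sketched as follows. This is an illustrative sketch only: the class and attribute names (`cache_db`, `answer_db`, `rag_answer`) are assumptions, and exact-match dictionary lookups stand in for the similarity search the real agent would use.

```python
# Sketch of the cache-first answer flow: operator-registered answers and
# the rating-based cache are consulted before the costlier LLM-driven RAG path.
# Exact-match lookup stands in for similar-question matching.

class CommunicationAgent:
    def __init__(self, rag_answer):
        self.cache_db = {}            # question -> highly rated past answer
        self.answer_db = {}           # question -> operator-registered answer
        self.rag_answer = rag_answer  # fallback: LLM-driven RAG pipeline

    def answer(self, question: str) -> str:
        if question in self.answer_db:
            return self.answer_db[question]
        if question in self.cache_db:
            return self.cache_db[question]
        return self.rag_answer(question)

    def rate(self, question: str, answer: str, rating: int) -> None:
        # High-rated answers are cached for reuse on similar questions.
        if rating >= 4:
            self.cache_db[question] = answer

    def escalate(self, question: str, operator_answer: str) -> None:
        # An unsatisfied user routes the question to an operator; the
        # operator's answer then serves all future similar questions.
        self.answer_db[question] = operator_answer
```

Answering from the cache or answer DB avoids an LLM call entirely, which is where the cost and quality gains in the list above come from.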

What's special about the RAGOps template

FAQ bots sometimes fail to deliver answers users find satisfying. The most common cause is that RAG lacks the right data to generate the answer. The RAGOps template captures user feedback when an answer is unsatisfactory and automatically prompts the right person in charge to author a proper answer.

Feedback that accumulates from many users is powerful. If 100 users each give feedback once, that's 100 instances of data improvement — and that improvement raises the system's value for every individual user.

Improving an AI service's accuracy through human feedback is generally called "Human in the loop." Humans review the AI's output and indicate improvements, raising the AI's performance. In a RAG system, that maps to operators correcting answers in production based on user satisfaction. Through this human involvement, answer accuracy improves.
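The human-in-the-loop cycle described here can be sketched as two small steps: user satisfaction decides whether an answer is kept or queued for review, and an operator's correction closes the loop. Function and variable names are illustrative, not the template's API.

```python
# Minimal human-in-the-loop cycle: user ratings route each answer either
# into the knowledge base or into a human review queue.

def process_feedback(question, answer, satisfied, knowledge_base, review_queue):
    """Route an interaction based on user satisfaction."""
    if satisfied:
        knowledge_base[question] = answer   # keep: reused next time
    else:
        review_queue.append(question)       # a human reviews and corrects

def operator_correct(question, corrected_answer, knowledge_base, review_queue):
    """The human step: an operator registers the proper answer."""
    knowledge_base[question] = corrected_answer
    if question in review_queue:
        review_queue.remove(question)
```

Every pass through `operator_correct` permanently improves the knowledge base, which is how feedback from one user raises answer quality for all users.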

There are situations where in-house data access must be controlled by user role and permission, and RAG faces the same challenge. For example, when building a Q&A system on top of HR review manuals, there is one manual for evaluators (managers) and another for those being evaluated (employees). You want a single system, but the manager-side manual must be accessible only to managers and HR.

For such cases, the RAGOps template enforces separation of accessible databases based on user permission settings, generating answers without including information the user is not authorized to see.
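Permission-based database separation can be sketched as follows, using the HR review-manual example from above. The dataset and role names are illustrative assumptions; the point is that retrieval only ever searches datasets the user's roles permit, so unauthorized information never enters the prompt.

```python
# Sketch of role-based data access control: each dataset carries the set of
# roles allowed to read it, and retrieval is restricted to permitted datasets.
# Dataset and role names are illustrative.

DATASET_ACCESS = {
    "manual_for_managers":  {"manager", "hr"},
    "manual_for_employees": {"manager", "hr", "employee"},
}

def accessible_datasets(user_roles: set[str]) -> list[str]:
    """Return only the datasets this user's roles may read."""
    return [name for name, allowed in DATASET_ACCESS.items()
            if user_roles & allowed]
```

Because filtering happens before retrieval, the LLM never sees restricted content for that user, which is stronger than filtering the generated answer afterward.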

You can freely choose your LLM with RAGOps. Combined with Amazon Bedrock, exaBase Studio enables secure answer generation that stays within the AWS network. Through Amazon Bedrock, you can use LLMs offered by AI21 Labs, Anthropic, Cohere, Meta, Stability AI, Amazon, and more via API. It supports multimodal data types — video and images, not just text — broadening the range of business use cases.
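As a rough illustration, invoking an Anthropic model through Amazon Bedrock with `boto3` looks like the sketch below. The model ID, region, and prompt wording are placeholder assumptions; the request body follows Bedrock's Anthropic Messages format, and the network call itself is shown commented out since it requires AWS credentials.

```python
import json

# Sketch: serialize a RAG prompt into the Anthropic Messages request body
# that Amazon Bedrock expects. Model ID and prompt are placeholders.

def build_bedrock_request(question: str, context: str) -> str:
    """Build the JSON body for an Anthropic model behind Bedrock."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    })

# The call itself needs AWS credentials, so it is shown but not executed:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = client.invoke_model(
#     modelId="anthropic.claude-3-haiku-20240307-v1:0",
#     body=build_bedrock_request("Why did we lose the deal?", context_text),
# )
```

Swapping the LLM then mostly means changing the `modelId` and the body format for that provider, while the rest of the RAG pipeline stays the same.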

Related press release: "ExaWizards' RAGOps now supports AWS' generative AI service Amazon Bedrock" (2024-05-14).

Toward field-driven, automated knowledge management

What we want to achieve with customers through the RAGOps template is automated knowledge management that genuinely helps end users.

The mechanism implemented here is not just a system users can ask: it is also a system that itself asks users questions. Triggered by user feedback, the AI goes out to gather information and expand its database so it becomes more useful. This collects collective knowledge automatically while building an AI system centered on the field's end users.

In one demo, we ask RAG about the reasons a sale was lost. Initially the answer is poor — the loss reasons are not registered in the knowledge base. After feedback to RAG, the Communication Agent goes to the responsible person to ask about the reasons.

RAGOps has very high potential as a knowledge management tool that lifts business productivity in the AI era.