RAG with databases on Google Cloud

Large language models (LLMs) are trained on an enormous corpus of publicly available data covering a broad range of topics. They are powerful in many ways, but they also have clear limitations.

Because the training data is so large, retraining an LLM frequently is resource-intensive, so a model may not have the most up-to-date information. Moreover, because LLMs are trained on publicly available data, anything behind a corporate firewall is unknown to them. Ask an LLM who won the latest sports game or what the premium is for your health insurance, and it will likely not know the answer. These limitations may be fine for general knowledge questions, but enterprises want to use LLMs to create generative AI apps that offer high accuracy, can access real-time information, and support complex conversational experiences.

An increasingly popular approach to this problem is to “ground” LLMs with a technique called Retrieval Augmented Generation (RAG). RAG enriches LLM prompts with fresh or proprietary data so that gen AI apps can deliver relevant and accurate information. This is especially crucial for companies and industries that are bound by regulations on sensitive information.

The RAG approach

Let’s take a look at how RAG works, using the example of a customer service chatbot that can answer a wide range of questions, including availability, pricing, and return policies. If you asked a typical LLM a generic question such as “what are some popular toys for kids under 5 years old?”, it would likely be able to respond with an answer. But because the LLM knows nothing about current inventory in stores, that answer is not going to be relevant for shoppers. For the customer support chatbot to answer from the latest data and policies, the RAG approach may prove effective.

This simplified RAG example consists of a pre-step and four steps. It walks through how an app can provide grounded answers by using the similarity search feature of a database that supports vector indexing.
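The individual steps are not listed in this section, so the following is only a minimal sketch of the general idea: retrieve the most similar stored row for a question, then enrich the prompt with it. Everything in it is an assumption made for illustration. The catalog rows, the three-dimensional hand-written vectors (standing in for the output of an embedding model), and the helper names `retrieve` and `build_prompt` are hypothetical, and an in-memory list stands in for a database with vector indexing.

```python
import math

# Hypothetical rows: (text, embedding). In a real app the embeddings would
# come from an embedding model and live in a database with a vector index.
CATALOG = [
    ("Wooden building blocks, ages 2-5, in stock", [0.9, 0.1, 0.0]),
    ("Remote-control drone, ages 14+, in stock", [0.1, 0.9, 0.1]),
    ("Return policy: unopened items can be returned", [0.0, 0.1, 0.9]),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, k=1):
    """Return the text of the k rows most similar to the query embedding."""
    ranked = sorted(
        CATALOG,
        key=lambda row: cosine_similarity(query_embedding, row[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_rows):
    """Enrich the user's question with the retrieved rows."""
    context = "\n".join(f"- {row}" for row in context_rows)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


question = "What are some popular toys for kids under 5 years old?"
query_embedding = [0.8, 0.2, 0.1]  # assumed embedding of the question
prompt = build_prompt(question, retrieve(query_embedding, k=1))
print(prompt)
```

The resulting prompt would then be sent to an LLM. A linear scan like this is only workable for a handful of rows; a vector index is what would make the same lookup practical over a large catalog.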

