Deploy LangChain on Cloud Run with LangServe

LangChain is a popular framework that makes it easy to build apps that use large language models (LLMs). LangChain recently introduced LangServe, a way to deploy any LangChain project as a REST API. LangServe supports deploying to both Cloud Run and Replit.

I asked Nuno Campos, one of the founding engineers at LangChain, why they chose Cloud Run. He said:

“We researched alternatives, and Cloud Run is the easiest and fastest way to get your app running in production.”

In this blog post, I’ll show you how to get started with LangServe and deploy a template to Cloud Run that calls the Vertex AI PaLM 2 for Chat model.

Generative AI apps explained

Generative AI chatbots such as Google Bard are powered by large language models (LLMs). Generally speaking, you prompt an LLM with some text and it’ll complete the prompt. While you can describe an LLM as an advanced auto-complete, that’s an oversimplified way of thinking about it. LLMs can write code, rephrase text, generate recommendations, and solve simple logic problems.

You can also send prompts to an LLM from your code, which can be very useful once you start integrating with your own private data and APIs. Some popular use cases include:

Asking questions over your own data (including manuals, support cases, product data)
Interacting with APIs using natural language, letting the LLM make API calls for you
Summarizing documents
Data labeling or text extraction

Building these integrations often involves creating pipelines (typically referred to as chains) that start with a prompt and bring your own data into it. That’s where LangChain comes in. Approaching 70k stars on GitHub, LangChain is by far the most popular framework for building LLM-powered apps.
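
To make the idea of a chain concrete, here’s a minimal sketch in plain Python. It does not use LangChain’s API. The step names (retrieve_context, build_prompt, fake_llm), the sample manual text, and the run_chain helper are all made up for illustration, and fake_llm is a stand-in that never calls a real model. It only shows the shape of a pipeline: fetch your own data, place it into a prompt, then hand the prompt to a model.

```python
from typing import Callable, Dict, List

# Each step takes the running state (a dict) and returns an updated copy.
Step = Callable[[Dict[str, str]], Dict[str, str]]


def retrieve_context(state: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical retrieval step: pick private data that matches the question."""
    manuals = {
        "reset": "Hold the power button for ten seconds to reset the device.",
        "battery": "The battery lasts about eight hours on a full charge.",
    }
    question = state["question"].lower()
    matches = [text for key, text in manuals.items() if key in question]
    return {**state, "context": "\n".join(matches)}


def build_prompt(state: Dict[str, str]) -> Dict[str, str]:
    """Bring the retrieved data into the prompt."""
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{state['context']}\n"
        f"Question: {state['question']}\n"
        "Answer:"
    )
    return {**state, "prompt": prompt}


def fake_llm(state: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for a real model call (assumption): reports the prompt length only."""
    reply = f"(model reply to a {len(state['prompt'])}-character prompt)"
    return {**state, "answer": reply}


def run_chain(steps: List[Step], state: Dict[str, str]) -> Dict[str, str]:
    """Run the steps in order, passing the state from one to the next."""
    for step in steps:
        state = step(state)
    return state


result = run_chain(
    [retrieve_context, build_prompt, fake_llm],
    {"question": "How do I reset my device?"},
)
print(result["prompt"])
print(result["answer"])
```

In a real app, you would swap fake_llm for an actual model call and retrieve_context for a lookup against your own data store. A framework like LangChain gives you ready-made versions of these pieces so you don’t have to write the plumbing yourself.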

Build chains with LangChain

LangChain provides all the abstractions you need to start building an LLM app, and it comes with many components out of the box, including LLMs, document loaders, text embedding models, vector stores, agents, and tools. I’m glad to see that many Google products have an integration with LangChain. Some highlights include Vertex AI Vector Search (previously known as Matching Engine) and hundreds of open source LLMs through Vertex AI Model Garden.

Here’s how you can use LangChain to call the Vertex AI PaLM 2 for Chat model and ask it to tell jokes about Chuck Norris: