
Ray on Vertex AI | Google Cloud Blog


Developers and engineers face several major challenges when scaling AI/ML workloads. One is getting access to the AI infrastructure they need: AI/ML workloads demand significant computational resources, such as CPUs and GPUs, and developers must secure enough of them to run their workloads. Another is handling the diverse patterns and programming interfaces required to scale AI/ML workloads effectively. Developers may need to adapt their code to run efficiently on the specific infrastructure they have available, which can be a time-consuming and complex task.

To address these challenges, Ray provides a comprehensive, easy-to-use distributed Python framework. With Ray, you configure a scalable cluster of computational resources and use a collection of domain-specific libraries to efficiently distribute common AI/ML tasks like training, serving, and tuning.

Today, we are thrilled to announce that our integration of Ray, a powerful distributed Python framework, with Google Cloud’s Vertex AI is generally available. This integration empowers AI developers to effortlessly scale their AI workloads on Vertex AI’s versatile infrastructure, unlocking the full potential of machine learning, data processing, and distributed computing.

Why Ray on Vertex AI?

Accelerated and Scalable AI Development: Ray’s distributed computing framework provides a unified experience for both generative AI and predictive AI, which seamlessly integrates with Vertex AI’s infrastructure services. Scale your Python-based machine learning, deep learning, reinforcement learning, data processing, and scientific computing workloads from a single machine to a massive cluster, so you can tackle even the most demanding AI challenges without the complexity of managing the underlying infrastructure.

Unified Development Experience: By integrating Ray’s ergonomic API with the Vertex AI SDK for Python, AI developers can now seamlessly transition from interactive prototyping in their local development environment or in Vertex AI Colab Enterprise to production deployment on Vertex AI’s managed infrastructure with minimal code changes.

Enterprise-Grade Security: Vertex AI’s robust security features, including VPC Service Controls, Private Service Connect, and Customer-Managed Encryption Keys (CMEK), can help safeguard your sensitive data and models while leveraging the power of Ray’s distributed computing capabilities. Vertex AI’s comprehensive security framework can help ensure that your Ray applications comply with strict enterprise security requirements.

Get started with Ray and Vertex AI

Let’s assume that you want to tune a small language model (SLM) such as Llama or Gemma. To fine-tune Gemma using Ray on Vertex AI, you first need a Ray cluster on Vertex AI, which you can create in just a few minutes using either the console or the Vertex AI SDK for Python. You can monitor the cluster either by leveraging the integration with Google Cloud Logging or by using the Ray Dashboard.
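As a rough sketch of the SDK path, cluster creation looks something like the following. This assumes the `google-cloud-aiplatform[ray]` extra is installed; the `vertex_ray` module, `Resources` type, and `create_ray_cluster` call reflect the SDK’s Ray surface, but names, parameters, and machine types here are illustrative and may differ by SDK version, so consult the Vertex AI documentation for the exact signature.

```python
from google.cloud import aiplatform
# vertex_ray ships with the google-cloud-aiplatform[ray] extra.
import vertex_ray
from vertex_ray import Resources

def create_small_cluster(project: str, region: str) -> str:
    """Sketch: provision a small Ray cluster on Vertex AI.

    Returns the resource name of the new cluster. Requires Google Cloud
    credentials with Vertex AI permissions; machine types and node counts
    below are placeholder values, not recommendations.
    """
    aiplatform.init(project=project, location=region)
    return vertex_ray.create_ray_cluster(
        head_node_type=Resources(machine_type="n1-standard-16"),
        worker_node_types=[
            Resources(machine_type="n1-standard-16", node_count=2),
        ],
    )
```

Once the cluster is up, you can connect to it from a notebook or local environment and run the same Ray code you prototyped elsewhere.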
