Using KubeRay and Kueue to orchestrate Ray applications in GKE.

See the Priority Scheduling with RayJob and Kueue guide for a full walk-through.

Gang scheduling

Scenario

Kueue’s all-or-nothing approach to workload admission ensures that RayJobs and RayClusters are scheduled only when all required resources are available. This improves resource efficiency by preventing partially provisioned clusters that hold resources but are unable to execute tasks. This strategy, often termed “gang scheduling,” is particularly valuable for resource-intensive AI/ML workloads.

Gang scheduling is important for use cases like data parallelism in distributed model training. Data parallelism shards data across multiple Pods, each running a copy of the same model. All gradients are sent to a parameter server, which updates the model parameters and then redistributes them to all Pods for the next iteration. If the RayJob or RayCluster is only partially provisioned, the parameter server can’t collect gradients from every Pod, so it can’t update the parameters and stays stuck until the custom resource is fully provisioned. In the meantime, the Pods that did start hold their resources without making progress. Gang scheduling avoids this situation.
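The difference between all-or-nothing admission and Pod-by-Pod scheduling can be illustrated with a small toy model. This is a sketch for intuition only, not Kueue's actual admission logic: the `Workload` class and the `admit_gang` and `admit_partial` functions are hypothetical names invented for this example, and GPUs are treated as a single pooled counter.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    pods: int          # number of Pods the Ray cluster needs
    gpus_per_pod: int  # GPUs requested by each Pod

    @property
    def total_gpus(self) -> int:
        return self.pods * self.gpus_per_pod


def admit_gang(workload: Workload, free_gpus: int) -> tuple[bool, int]:
    """All-or-nothing: admit only if every Pod fits; otherwise reserve nothing.

    Returns (admitted, free GPUs remaining).
    """
    if workload.total_gpus <= free_gpus:
        return True, free_gpus - workload.total_gpus
    return False, free_gpus


def admit_partial(workload: Workload, free_gpus: int) -> tuple[int, int]:
    """Pod-by-Pod: start as many Pods as fit, even if the job cannot make progress.

    Returns (Pods started, free GPUs remaining).
    """
    started = min(workload.pods, free_gpus // workload.gpus_per_pod)
    return started, free_gpus - started * workload.gpus_per_pod


if __name__ == "__main__":
    job = Workload(name="train", pods=4, gpus_per_pod=2)  # needs 8 GPUs in total

    # Only 6 GPUs free: gang admission waits and leaves all 6 GPUs available.
    print(admit_gang(job, free_gpus=6))     # (False, 6)

    # Pod-by-Pod scheduling starts 3 of 4 Pods, occupying all 6 GPUs
    # while the training job still cannot run.
    print(admit_partial(job, free_gpus=6))  # (3, 0)

    # With 8 GPUs free, the whole gang is admitted at once.
    print(admit_gang(job, free_gpus=8))     # (True, 0)
```

In the partial case, six GPUs are held by a job that cannot run, which is the stuck state described above. Under gang admission the same six GPUs stay free for other workloads that do fit.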

How do Kueue and KubeRay implement gang scheduling?

You can take advantage of Kueue’s dynamic resource provisioning and queueing to orchestrate gang scheduling with KubeRay. This is essential when working with limited hardware accelerators like GPUs and TPUs. Kueue ensures Ray workloads execute only when all required resources are available, preventing wasted GPU/TPU cycles and maximizing utilization.

Kueue achieves this efficient gang scheduling on GKE using the ProvisioningRequest API. This API signals that a Ray workload should wait until the necessary compute nodes can be provisioned simultaneously. GKE’s cluster autoscaler accepts the ProvisioningRequest, scaling up nodes in one step, if and only if all required resources are available. Ray cluster Pods are then scheduled together on the newly provisioned nodes. Refer to How ProvisioningRequest Works for more details.
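As a rough sketch of how this wiring might look, the snippet below builds the two Kueue objects that typically connect a queue to the ProvisioningRequest API: an AdmissionCheck and the ProvisioningRequestConfig it references. The object names (`dws-prov`, `dws-config`) are hypothetical, and the API version (`kueue.x-k8s.io/v1beta1`), the provisioning class (`queued-provisioning.gke.io`), and the managed resource (`nvidia.com/gpu`) are assumptions based on commonly documented Kueue and GKE settings. Verify them against the Kueue version installed in your cluster before applying anything.

```python
import json


def provisioning_check(check_name: str, config_name: str) -> list[dict]:
    """Build an AdmissionCheck and the ProvisioningRequestConfig it points to.

    Field names and values are assumptions based on the Kueue v1beta1 API.
    """
    admission_check = {
        "apiVersion": "kueue.x-k8s.io/v1beta1",
        "kind": "AdmissionCheck",
        "metadata": {"name": check_name},
        "spec": {
            # Controller that turns admitted workloads into ProvisioningRequests.
            "controllerName": "kueue.x-k8s.io/provisioning-request",
            "parameters": {
                "apiGroup": "kueue.x-k8s.io",
                "kind": "ProvisioningRequestConfig",
                "name": config_name,
            },
        },
    }
    config = {
        "apiVersion": "kueue.x-k8s.io/v1beta1",
        "kind": "ProvisioningRequestConfig",
        "metadata": {"name": config_name},
        "spec": {
            # Assumed GKE provisioning class for all-at-once node scale-up.
            "provisioningClassName": "queued-provisioning.gke.io",
            # Resources whose capacity the ProvisioningRequest should cover.
            "managedResources": ["nvidia.com/gpu"],
        },
    }
    return [admission_check, config]


if __name__ == "__main__":
    # Wrap both objects in a List so they can be saved to one JSON file.
    manifest = {
        "apiVersion": "v1",
        "kind": "List",
        "items": provisioning_check("dws-prov", "dws-config"),
    }
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`. A ClusterQueue would then list the check by name under its admission checks, and the Ray workload would be submitted to a LocalQueue that points at that ClusterQueue. Those objects are omitted here to keep the sketch short.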

For a step-by-step demonstration, see the Gang Scheduling with RayJob and Kueue guide.

Conclusion

KubeRay and Kueue offer powerful tools for managing and optimizing Ray applications within GKE. Priority scheduling helps ensure that your most important AI/ML tasks get the resources they need first. Gang scheduling helps you make the most of hardware accelerators, preventing wasted time and maximizing efficiency. Together, these techniques improve the performance and cost-effectiveness of your Ray applications in the cloud.