Using Cloud TPU Multislice to scale AI workloads

“Multislice training has been a game-changer. It’s made it easy to scale our ML workloads beyond a single densely-interconnected slice using data-center networking. JAX XLA made it easy to set up and delivered high performance out-of-the-box.” - Myle Ott, Co-founder, Character AI

Multislice supports the JAX and PyTorch frameworks. For fast out-of-the-box performance, in addition to compiler support for all models, we provide MaxText and PAX for LLMs as open-sourced, well-tested examples written in pure Python and JAX that can be used as starter code. PAX is a framework for training large-scale models that allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry-leading model FLOPs utilization (MFU) rates. MaxText is a more minimal framework intended for forking and adaptation. The only code change compared to single-slice code is the extra sharding dimension for data center network (DCN) parallelism.
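The extra sharding dimension can be pictured with a small plain-Python sketch. This is not MaxText or PAX code and it does not call JAX; the function name `shard_batch` and its parameters are hypothetical. It only assumes that a global batch is split first across slices (the DCN axis) and then across the chips inside each slice.

```python
from itertools import product


def shard_batch(global_batch_size, num_slices, chips_per_slice):
    """Map each (slice, chip) coordinate to the example indices it holds.

    Hypothetical illustration: the batch is divided across slices first
    (the DCN axis), then across the chips within each slice.
    """
    devices = num_slices * chips_per_slice
    if global_batch_size % devices != 0:
        raise ValueError("batch size must divide evenly across all chips")
    per_chip = global_batch_size // devices

    layout = {}
    for slice_id, chip_id in product(range(num_slices), range(chips_per_slice)):
        flat = slice_id * chips_per_slice + chip_id
        layout[(slice_id, chip_id)] = range(flat * per_chip, (flat + 1) * per_chip)
    return layout


# Single slice: the slice axis has size 1, so only the chip axis matters.
single = shard_batch(32, num_slices=1, chips_per_slice=4)

# Two slices: same code, with one extra dimension of size 2.
multi = shard_batch(32, num_slices=2, chips_per_slice=4)

print(len(single[(0, 0)]))   # 8 examples per chip
print(list(multi[(1, 0)]))   # [16, 17, 18, 19]
```

Going from one slice to two changes only the size of the slice axis; the per-chip logic is untouched, which is the sense in which the change is a single added dimension.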

High performance networking

Multislice supports AllReduce, Broadcast, Reduce, AllGather and ReduceScatter collective communication operations over Google’s Jupiter data center network. As reported in August 2022, Jupiter reduces flow completion time by 10%, improves throughput by 30%, uses 40% less power, incurs 30% less capex costs, and delivers 50x less downtime than previous generations of the Google data center network.³
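For readers less familiar with these collectives, the sketch below models what each one computes, using plain Python lists in place of device buffers. It is a reference for the semantics only, not the Multislice or XLA implementation; the function names are hypothetical, and summation is assumed as the reduction operator.

```python
def _elementwise_sum(buffers):
    """Sum the buffers position by position."""
    return [sum(values) for values in zip(*buffers)]


def all_reduce(buffers):
    """Every participant ends up with the elementwise sum."""
    total = _elementwise_sum(buffers)
    return [list(total) for _ in buffers]


def broadcast(buffers, root=0):
    """Every participant ends up with a copy of the root's buffer."""
    return [list(buffers[root]) for _ in buffers]


def reduce_to_root(buffers, root=0):
    """Only the root receives the sum; the others keep their own data."""
    total = _elementwise_sum(buffers)
    return [total if rank == root else list(buf)
            for rank, buf in enumerate(buffers)]


def all_gather(buffers):
    """Every participant ends up with the concatenation of all buffers."""
    gathered = [x for buf in buffers for x in buf]
    return [list(gathered) for _ in buffers]


def reduce_scatter(buffers):
    """Sum elementwise, then give each participant one equal chunk."""
    n = len(buffers)
    total = _elementwise_sum(buffers)
    if len(total) % n != 0:
        raise ValueError("buffer length must be divisible by participant count")
    chunk = len(total) // n
    return [total[r * chunk:(r + 1) * chunk] for r in range(n)]


# Two participants, each holding a two-element buffer.
data = [[1, 2], [3, 4]]
print(all_reduce(data))      # [[4, 6], [4, 6]]
print(reduce_scatter(data))  # [[4], [6]]
print(all_gather(data))      # [[1, 2, 3, 4], [1, 2, 3, 4]]
```

Note that an AllReduce gives the same result as a ReduceScatter followed by an AllGather, which is one common way the operation is decomposed.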

Easy to manage

There are two options for managing a Multislice job: using Compute Engine Queued Resource CLIs and APIs, or through Google Kubernetes Engine (GKE).

Specific options allow for one-step deletion and creation of the collection of slices. And fast recovery means jobs are restarted quickly even when individual slices are interrupted.

Reliable and fault tolerant

Your model training jobs restart automatically from the previous checkpoint even when individual slices fail. Using Multislice with GKE further improves the failure recovery experience: a single field change in the YAML file implements automatic retry on encountering errors.
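The restart-from-checkpoint behavior can be sketched as a simulated training loop. This is a toy model, not the GKE or Queued Resource mechanism: `SliceFailure`, `train`, `run_with_retries`, and the in-memory `checkpoint` dictionary are all hypothetical stand-ins, and the failure is injected once at a chosen step.

```python
class SliceFailure(Exception):
    """Stand-in for a slice being interrupted mid-training."""


def train(total_steps, checkpoint_every, checkpoint, fail_at=None):
    """Train from the last checkpointed step; optionally fail at `fail_at`."""
    step = checkpoint["step"]
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise SliceFailure(f"slice lost at step {step}")
        step += 1  # one training step
        if step % checkpoint_every == 0:
            checkpoint["step"] = step  # persist progress
    checkpoint["step"] = step
    return step


def run_with_retries(total_steps, checkpoint_every, fail_at, max_retries=3):
    """Restart from the latest checkpoint whenever a slice fails."""
    checkpoint = {"step": 0}
    resume_points = []
    pending_failure = fail_at
    while True:
        try:
            final_step = train(total_steps, checkpoint_every,
                               checkpoint, pending_failure)
            return final_step, resume_points
        except SliceFailure:
            if len(resume_points) == max_retries:
                raise
            resume_points.append(checkpoint["step"])
            pending_failure = None  # the simulated failure happens only once


final_step, resume_points = run_with_retries(
    total_steps=10, checkpoint_every=4, fail_at=7)
print(final_step)     # 10
print(resume_points)  # [4]
```

Here the failure at step 7 costs only the three steps since the checkpoint at step 4, so more frequent checkpoints reduce the work that is redone after an interruption.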

“Google Cloud’s TPU Multislice provided significant productivity and efficiency gains for us right out-of-the-box, enabling us to scale our language model training reliably. We recommend Multislice to anyone building large generative language AI models.” - Emad Mostaque, CEO, Stability AI

Get started

Multislice was designed to enable efficient large-scale AI model training. To scale AI workloads, hardware and software must work in concert. We have kept AI development productivity top of mind and are excited for you to try Multislice in preview on both Cloud TPU v4 and the newly announced Cloud TPU v5e.

Please contact your Google Cloud account representative to learn more and try Cloud TPU with Multislice using PAX and MaxText.

1. Google internal data as of August, 2023
2. Google internal data as of August, 2023
3. Google internal data as of August, 2023