in

Google announces the Cloud TPU v5p, its most powerful AI accelerator yet


Google today announced the launch of its new Gemini large language model (LLM) and with that, the company also launched its new Cloud TPU v5p, an updated version of its Cloud TPU v5e, which launched into general availability earlier this year. A v5p pod consists of a total of 8,960 chips and is backed by Google’s fastest interconnect yet, with up to 4,800 Gpbs per chip.

It’s no surprise that Google promises that these chips are significantly faster than the v4 TPUs. The team claims that the v5p features a 2x improvement in FLOPS and 3x improvement in high-bandwidth memory. That’s a bit like comparing the new Gemini model to the older OpenAI GPT 3.5 model, though. Google itself, after all, already moved the state of the art beyond the TPU v4. In many ways, though, v5e pods were a bit of a downgrade from the v4 pod, with only 256 v5e chips per pod versus 4096 in the v4 pods and a total of 197 TFLOPs 16-bit floating point performance per v5e chip versus 275 for the v4 chips. For the new v5p, Google promises up to 459 TFLOPs of 16-bit floating point performance, backed by the faster interconnect.

Image Credits: Google

Google says all of this means the TPU v5p can train a large language model like GPT3-175B 2.8 times faster than the TPU v4 — and do so more cost-effectively, too (though the TPU v5e, while slower, actually offers more relative performance per dollar than the v5p).

Image Credits: Google

In our early stage usage, Google DeepMind and Google Research have observed 2X speedups for LLM training workloads using TPU v5p chips compared to the performance on our TPU v4 generation,” writes Jeff Dean, chief scientist, Google DeepMind and Google Research. “The robust support for ML Frameworks (JAX, PyTorch, TensorFlow) and orchestration tools enables us to scale even more efficiently on v5p. With the 2nd generation of SparseCores we also see significant improvement in the performance of embeddings-heavy workloads. TPUs are vital to enabling our largest-scale research and engineering efforts on cutting edge models like Gemini.”

The new TPU v5p isn’t generally available yet, so developers will have to reach out to their Google account manager to get on the list.

Google's Gemini isn't the generative AI model we expected

Google’s Gemini isn’t the generative AI model we expected

How we run our in-house generative AI accelerator: Framework for ideation

Five-month-old Indian AI startup Sarvam scores $41 million funding