“The early results of Imagen 3 models have pleasantly surprised us with its quality and speed in our testing,” said Gaurav Sharma, Head of AI Research, Typeface, a startup that specializes in leveraging generative AI for enterprise content creation. “It brings improvements in generating details, as well as lifestyle images of humans. As early partners of Google’s foundation models, we are looking forward to exploring the new Imagen and Gemini models further on the journey ahead together.”
“We make it easy for our users to turn their ideas into eye-catching presentations, websites, and other visual documents generated with the power of AI. To enable even greater personalization and creativity while reducing manual tasks, we offer the high-quality text-to-image capabilities of Imagen,” said Jon Noronha, Co-Founder, Gamma. “Our users have already generated over 4 million images with Imagen, and we’re excited about how Imagen 3 will enable them to create images even faster, include text in images, and safely improve the generation of photorealistic images with people.”
“Since adding Imagen to our AI image generator, our users have generated millions of pictures with the model. We’re excited by the enhancements Imagen 3 promises as it enables our users to execute their ideas faster without sacrificing quality. As an important enhancement to Shutterstock’s launch of the first ethically-sourced AI image generator, we also appreciate how safety is built in and that the content that is created is protected under Google Cloud’s indemnification for generative AI,” said Justin Hiza, VP of Data Services, Shutterstock.
Customers can click here to apply for access to Imagen 3 on Vertex AI.
Third-party and open models: Delivering expanded model choice with Vertex AI
At Google Cloud, we’re committed to empowering customer choice and innovation through our curated collection of first-party, open, and third-party models available on Vertex AI. That’s why we’re thrilled that we recently added Anthropic’s newly released model, Claude 3.5 Sonnet, to Vertex AI. Customers can begin experimenting with Claude 3.5 Sonnet, or deploying it in production, on Google Cloud. Later this summer, we’ll be deepening our partnership with Mistral with the addition of Mistral Small, Mistral Large, and Mistral Codestral to Vertex AI Model Garden.
Continuing our push to meet customers where they are, earlier this year, we introduced Gemma, a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. We’re officially releasing Gemma 2 to researchers and developers globally. Available in both 9-billion (9B) and 27-billion (27B) parameter sizes, Gemma 2 is much more powerful and efficient than the first generation, with significant safety advancements built in. Starting next month, customers will be able to access Gemma 2 on Vertex AI.
Lower costs: Context caching for both Gemini 1.5 Pro and Flash
To help our customers efficiently take advantage of Gemini’s vast context windows, starting today, we are rolling out context caching in public preview for both 1.5 Pro and Flash. As context length increases, it can be expensive and slow to get responses for long-context applications, making it difficult to deploy to production. Vertex AI context caching helps customers reduce input costs by 75 percent by reusing cached data for frequently used context. Today, Google is the only provider to offer a context caching API.
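To make the savings concrete, here is a minimal sketch of the arithmetic, assuming the 75 percent reduction applies only to the cached portion of the input tokens. The function name, the per-token price, and the token counts are hypothetical placeholders rather than published Vertex AI pricing, and the sketch ignores any charges for storing the cache.

```python
def input_cost(cached_tokens: int, fresh_tokens: int,
               price_per_million: float, cached_discount: float = 0.75) -> float:
    """Estimate the input cost of one request when part of the prompt is cached.

    All parameters are illustrative assumptions, not official pricing terms.
    """
    if not 0.0 <= cached_discount <= 1.0:
        raise ValueError("cached_discount must be between 0 and 1")
    # Cached tokens are billed at the discounted rate; fresh tokens at full price.
    cached_rate = price_per_million * (1.0 - cached_discount)
    return (cached_tokens * cached_rate + fresh_tokens * price_per_million) / 1_000_000


# Hypothetical workload: a 900,000-token document reused across requests,
# plus a 1,000-token question that changes every time.
PRICE = 1.00  # hypothetical dollars per million input tokens

without_cache = input_cost(0, 901_000, PRICE)
with_cache = input_cost(900_000, 1_000, PRICE)
savings = 1 - with_cache / without_cache

print(f"without cache: ${without_cache:.4f}")
print(f"with cache:    ${with_cache:.4f}")
print(f"savings:       {savings:.1%}")
```

Under these assumptions, the overall savings land slightly below 75 percent, because the uncached question is still billed at the full rate. The larger the share of the prompt that is reused, the closer the total gets to the full discount.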
Predictable performance: Provisioned throughput for Gemini models
Generally available today through an allowlist, provisioned throughput lets customers responsibly scale their usage of Google’s first-party models, like 1.5 Flash, with assurances for both capacity and price. This Vertex AI feature brings predictability and reliability to customer production workloads, giving customers the assurance required to scale gen AI workloads aggressively.
Delivering enterprise truth: Grounding with Google Search and now, grounding with third-party data
Enterprise readiness requires more than the model. Enterprises need to maximize factuality and drastically minimize hallucinations, which means grounding model output in web, first-party, and third-party truth and data, while meeting stringent enterprise-readiness standards, such as data governance and sovereignty.
At Google I/O, we announced the general availability of Grounding with Google Search in Vertex AI. With the service now generally available, businesses of all kinds can augment Gemini outputs with Google Search grounding, giving the models access to fresh and high-quality information. Customers can easily integrate the enhanced Gemini models into their AI agents.
“Gemini 1.5 Flash creates opportunities to better manage ROI moving forward. With the ability to ground model responses in Google Search, we can better increase the relevancy of results of our conversational experience, Ipsos Facto, with fresh data,” said JC Escalante of Ipsos. “This capability is a key component in our efforts to improve output quality and researcher experience.”
“Grounding with Google Search translates into more accurate, up-to-date, and trustworthy answers,” said Spencer Chan, Product Lead at Quora, which offers Grounding with Google Search on its Poe platform. “We’ve been delighted with the positive feedback so far, as users are now able to interact with Gemini bots with even greater confidence.”
Customers can click here to get started with Grounding with Google Search.
Additionally, today we are announcing that starting next quarter, Vertex AI will offer a new service that will enable customers to ground their AI agents with specialized third-party data. This will help enterprises integrate third-party data into their generative AI agents to unlock unique use cases and drive greater enterprise truth across their AI experiences. We are working with premier providers such as Moody’s, MSCI, Thomson Reuters, and ZoomInfo to bring their data to this service.
“Google Cloud’s third-party data grounding offerings will open up new applications for KPMG and our clients,” said Brad Brown, Global Tax & Legal CTO at KPMG. “By seamlessly integrating specialized third-party data from industry leaders into our generative AI offerings, we can reduce time to insight, drive more informed decision-making, and ultimately deliver greater value using highly trustworthy data sources.”
To learn more about grounding, click here for a deeper dive.
More factual responses: Grounding with high-fidelity mode
In data-intensive industries like financial services, healthcare, and insurance, generative AI use cases often require the generated response to be sourced from only the provided context, not the model’s world knowledge. Grounding with high-fidelity mode, announced in experimental preview, is purpose-built to support such grounding use cases, including summarization across multiple documents, data extraction against a set corpus of financial data, or processing across a predefined set of documents. High-fidelity mode is powered by a version of Gemini 1.5 Flash that’s been fine-tuned to use only customer-provided content to generate answers, and it ensures high levels of factuality in its responses.
Best options for data sovereignty: Data residency for data stored at-rest, limiting ML processing to region
Customers, especially those from regulated industries, demand control over where their data is stored and processed when using generative AI capabilities. To meet these data sovereignty requirements, we offer data residency guarantees for data stored at rest in 23 countries (13 of which were added in 2024: Spain, Italy, Israel, Switzerland, Poland, Finland, Brazil, India, Taiwan, Hong Kong, Australia, KSA, and Qatar), with additional guarantees around limiting related ML processing to the US and EU. We are also working on expanding our ML processing commitments to eight more countries, starting with four countries in 2024.
Get started with Vertex AI today
As the customer stories we’ve shared today demonstrate, Vertex AI helps businesses turn the power of generative AI into tangible, transformative results. We look forward to continuing to bring innovations like Gemini 1.5 Flash and Grounding with Google Search to our customers, and to making Vertex AI the most enterprise-ready generative AI platform.
To get started with Gemini 1.5 Flash on Vertex AI, click here.
To learn more about how Vertex AI can help your organization, click here, and to learn more about how Google Cloud customers are innovating with generative AI, read “How 7 businesses are putting Google Cloud’s AI innovations to work.”
1. Gartner, Magic Quadrant for Cloud AI Developer Services, Jim Scheibmeir, Arun Batchu, Mike Fang – April 29, 2024. Gartner, Gartner Magic Quadrant for Data Science and Machine Learning Platforms, Afraz Jaffri, Aura Popa, Peter Krensky, Jim Hare, Raghvender Bhati, Maryam Hassanlou and Tong Zhang – June 17, 2024. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner Inc. and/or its affiliates and are used herein with permission. All rights reserved. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose
2. Per the study published by the Gemini team on 14 June 2024, “Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.”