
Google Announces Gemini, Its Most Capable AI Model


Google today announced Gemini, which it describes as the most capable, flexible, and general AI model it has ever built. Gemini comes in three sizes: Ultra, Pro, and Nano, and the latter will run on-device on the Pixel 8 Pro that the firm released back in October.

“Gemini is the first realization of the vision we had when we formed Google DeepMind earlier this year,” Google CEO Sundar Pichai said. “This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company.”


“Gemini was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video,” Google DeepMind CEO Demis Hassabis added. “It’s able to efficiently run on everything from data centers to mobile devices. Its state-of-the-art capabilities will significantly enhance the way developers and enterprise customers build and scale with AI.”

According to Google, Gemini is the first AI model to outperform human experts on MMLU (massive multitask language understanding), a benchmark that spans 57 subjects, including math, physics, history, law, medicine, and ethics, to test both real-world knowledge and problem-solving abilities. Google also provides head-to-head results against OpenAI’s GPT-4, which it says Gemini outperforms on general reasoning, math, and code.

Gemini has also scored a “state-of-the-art score of 59.4 percent” on the new MMMU benchmark, which consists of multimodal tasks spanning different domains that require deliberate reasoning, such as extracting text from images without OCR (optical character recognition) assistance. Here, too, Google claims that Gemini outscores OpenAI’s GPT-4V across various image, video, and audio tasks.

“Gemini 1.0’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information,” Hassabis said. “This makes it uniquely skilled at uncovering knowledge that can be difficult to discern amid vast amounts of data … [It was also] trained to recognize and understand text, images, audio and more at the same time, so it better understands nuanced information and can answer questions relating to complicated topics. This makes it especially good at explaining reasoning in complex subjects like math and physics.”

For software developers, Gemini can understand, explain, and generate high-quality code in languages like Python, Java, C++, and Go. And it was trained on Google’s in-house Tensor Processing Units (TPUs) to be reliable, efficient, and scalable. Google also built a next-generation Cloud TPU v5p specifically to train cutting-edge AI models and accelerate the development of Gemini.

Gemini 1.0 is now rolling out across a variety of Google products and services. Starting today, Bard gains access to a tuned version of Gemini Pro for more advanced reasoning, planning, understanding, and more. (It’s in English only right now but is available in over 170 countries.) Gemini is also being used with Google’s Search Generative Experience (SGE), providing a 40 percent reduction in latency in English in the U.S., alongside other quality improvements. And Gemini Nano is coming to the Pixel 8 Pro today, where it will power the new Summarize feature in the Recorder app and Smart Reply in Gboard, starting with WhatsApp (with more messaging apps coming next year).

Gemini will be added to Search, Ads, Chrome, and Duet AI in 2024, Google says. And Gemini Ultra is coming to select customers, developers, partners, and safety and responsibility experts for experimentation and feedback before it rolls out to developers and enterprise customers early next year.

But developers interested in getting started now can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI starting December 13. And Android developers interested in the on-device Gemini Nano will be able to build with it via AICore, a new system capability in Android 14 that is available in early preview starting with Pixel 8 Pro devices.
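For developers who want a sense of what that looks like in practice, here is a minimal sketch of calling Gemini Pro from Python via Google AI Studio. It assumes the google-generativeai SDK, the "gemini-pro" model name, and an API key issued by Google AI Studio; the exact package and model names may differ from what Google ultimately ships.

import google.generativeai as genai

# Authenticate with an API key from Google AI Studio (assumption: key-based auth).
genai.configure(api_key="YOUR_API_KEY")

# Load the Gemini Pro model and ask it for a short text completion.
model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Explain, in two sentences, what makes a model multimodal.")
print(response.text)

Vertex AI exposes the same model through Google Cloud's own SDK and project-based authentication, so the snippet above applies only to the AI Studio path.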

You can learn more on the Gemini website.
