Physical Intelligence, a pioneering startup focused on creating practical artificial intelligence models to serve as “brains” for robots, announced today that it has raised $400 million in new funding. The round was led by Jeff Bezos along with Thrive Capital and Lux Capital. Other notable investors included leading AI firm OpenAI, Redpoint Ventures, and Bond.
Co-founder and Chief Executive Karol Hausman, a former robotics scientist at Google, leads a team of researchers from prestigious institutions like the University of California at Berkeley and Stanford University. Physical Intelligence aims to build a universal AI model for robots—an artificial physical intelligence—that can understand and interact with the physical world, enabling robots to perform complex, multi-step tasks.
At the heart of Physical Intelligence’s innovation is its AI model, π0 (pi-zero), a general-purpose robot foundation model. Whereas traditional robots are narrow specialists performing repetitive motions in controlled environments, π0 is designed to let robots adapt to complex and unpredictable real-world settings, such as homes or dynamic workplaces.
Over the past eight months, the company has developed π0 by training it on what it describes as the largest robot interaction dataset to date. The dataset combines open-source data with a diverse set of dexterous tasks collected across eight distinct robots. Tasks range from folding laundry and assembling boxes to bussing tables and packing items, covering a wide range of robot dexterity and physical interaction.
“Our goal in selecting these tasks is not to solve any particular application, but to start to provide our model with a general understanding of physical interactions—an initial foundation for physical intelligence,” the company explained in a recent blog post.
π0 leverages internet-scale pretraining by starting from a pre-trained vision-language model (VLM). VLMs are trained on vast amounts of text and images from the web, providing rich semantic knowledge and visual understanding. To enable high-frequency dexterous control required for robot manipulation, Physical Intelligence developed a novel method to augment pre-trained VLMs with continuous action outputs via flow matching, a variant of diffusion models.
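Physical Intelligence has not released implementation details beyond this description, but the core recipe, training an “action expert” to regress a velocity field that carries Gaussian noise toward a chunk of expert actions and then integrating that field at inference time, can be sketched in a few dozen lines. The example below is a minimal, illustrative PyTorch sketch rather than the company’s code: the module sizes, 50-step action horizon, pooled VLM features, and Euler sampler are all assumptions.

```python
import torch
import torch.nn as nn


class FlowMatchingActionHead(nn.Module):
    """Illustrative action expert: predicts the flow-matching velocity field
    for a chunk of future continuous actions, conditioned on VLM features."""

    def __init__(self, vlm_dim=1024, action_dim=14, horizon=50, hidden=512):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        self.net = nn.Sequential(
            nn.Linear(vlm_dim + horizon * action_dim + 1, hidden),
            nn.GELU(),
            nn.Linear(hidden, hidden),
            nn.GELU(),
            nn.Linear(hidden, horizon * action_dim),
        )

    def forward(self, vlm_features, noisy_actions, t):
        # vlm_features: (B, vlm_dim) pooled vision-language embedding
        # noisy_actions: (B, horizon, action_dim); t: (B, 1) in [0, 1]
        x = torch.cat([vlm_features, noisy_actions.flatten(1), t], dim=-1)
        return self.net(x).view(-1, self.horizon, self.action_dim)


def flow_matching_loss(head, vlm_features, expert_actions):
    """One training step of conditional flow matching: regress the velocity
    that moves Gaussian noise toward the expert action chunk."""
    noise = torch.randn_like(expert_actions)            # x_0 ~ N(0, I)
    t = torch.rand(expert_actions.size(0), 1)           # t ~ U(0, 1)
    t_b = t.view(-1, 1, 1)
    x_t = (1 - t_b) * noise + t_b * expert_actions      # straight-line path
    target_velocity = expert_actions - noise            # d x_t / d t
    pred_velocity = head(vlm_features, x_t, t)
    return ((pred_velocity - target_velocity) ** 2).mean()


@torch.no_grad()
def sample_actions(head, vlm_features, steps=10):
    """Integrate the learned velocity field from noise to an action chunk
    with a simple Euler solver."""
    x = torch.randn(vlm_features.size(0), head.horizon, head.action_dim)
    for i in range(steps):
        t = torch.full((vlm_features.size(0), 1), i / steps)
        x = x + head(vlm_features, x, t) / steps
    return x
```

Because a head of this kind emits an entire chunk of future actions and needs only a handful of integration steps, a design along these lines can support the high-frequency control the company says dexterous manipulation requires, while the heavier VLM backbone supplies semantic and visual context.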
Using π0, Physical Intelligence has demonstrated robots performing tasks that have long been challenging for robotic systems:
- Folding Laundry: The robot can unload a dryer, bring clothes to a table, and fold them into neat stacks autonomously. This task requires the robot to handle deformable objects and adjust its actions based on the clothing’s shape and position.
- Bussing Tables: The robot distinguishes between dishes and trash, placing items appropriately while employing emergent strategies like shaking trash off plates before stacking them. It can adapt to various objects and scenarios without explicit programming for each variation.
- Assembling Boxes: The robot takes a flattened cardboard box and constructs it by folding sides and tucking in flaps. This task demands precise manipulation and the ability to adjust actions if the box doesn’t fold as expected.
One of the significant challenges in creating a generalist robot model like π0 is the lack of large-scale, multitask, multi-robot data. Physical Intelligence addresses this by combining diverse datasets and leveraging pre-trained VLMs to imbue the model with broad knowledge.
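The company does not describe its data pipeline, but a common pattern for co-training on heterogeneous robot data is to normalize every episode to a shared schema and then sample from the sources with fixed mixture weights so that no single robot or task dominates. The sketch below illustrates that pattern; the dataset names, mixture weights, and episode schema are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class Episode:
    """A single demonstration, normalized to a shared schema across robots."""
    robot: str
    instruction: str
    actions: List[List[float]]  # padded to a common action dimension


def mix_datasets(sources: Dict[str, list], weights: Dict[str, float],
                 seed: int = 0) -> Iterator[Episode]:
    """Yield training episodes drawn from several robot datasets in
    proportion to hand-chosen mixture weights."""
    rng = random.Random(seed)
    names = list(sources)
    probs = [weights[n] for n in names]
    while True:
        name = rng.choices(names, weights=probs, k=1)[0]
        yield rng.choice(sources[name])


# Hypothetical example: tiny in-memory stand-ins for open-source data and
# internally collected dexterous demonstrations.
open_source = [Episode("widowx", "pick up the cup", [[0.0] * 7])]
in_house = [Episode("mobile_manipulator", "fold the towel", [[0.0] * 7])]

stream = mix_datasets(
    {"open_source": open_source, "in_house": in_house},
    weights={"open_source": 0.3, "in_house": 0.7},
)
batch = [next(stream) for _ in range(4)]
```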
The company compared π0 against other robot foundation models, including OpenVLA, a 7-billion-parameter open-source model, and Octo, a 93-million-parameter model. π0 outperformed both on complex tasks, a result the company attributes to its large-scale multi-task training data and novel architecture.
To achieve its vision, the company emphasizes the need not just for more data but also for collaboration across the robotics community. It has ongoing partnerships with companies and robotics labs to refine hardware designs and incorporate partner data into its pre-trained models.
Bringing “brains” to robots is becoming a significant trend in the tech industry. Last year, researchers at Google unveiled a robot powered by PaLM-E, a 562-billion-parameter model capable of following basic voice commands, such as picking up and delivering objects. Earlier this year, Nvidia announced Project GR00T, a general-purpose foundation model for humanoid robots.
Other companies are also emerging as key players in this space. Vayu Robotics, for instance, is considered a strong contender: Geoffrey Hinton serves on its advisory board, its CTO is a former Ph.D. student of Hinton’s, and its CEO, who previously led Velodyne, brings substantial industry experience. The activity highlights the growing competition and innovation in developing advanced robotic “brains” capable of complex tasks.
The latest injection of capital values the company at approximately $2.4 billion post-money and follows a $70 million seed round in March led by Thrive Capital.