OpenAI teased its latest video AI model called Sora this week, prompting a rapturous response from observers and the media.
And credit where credit is due, the samples the company showed off are truly impressive, from photorealistic footage of a dog sitting on a windowsill to a group of woolly mammoths stampeding across a snowy landscape.
But many of the clips the company shared this week don’t hold up to closer scrutiny, and they show that, as significant as Sora is, there are many bugs to hammer out before it’s ready for primetime.
Case in point: the very first sample embedded at the top of the company’s website, a dolly-tracked shot of a “stylish woman” walking “down a Tokyo street filled with warm glowing neon and animated city signage.”
Sure, it seems impressive at first, but if you watch the movement of the woman’s legs and feet closely throughout the one-minute clip, some serious flaws come to light. Around the 16- and 31-second marks, her legs and feet subtly switch places with each other. Seriously, think that through: her left and right legs completely switch positions, showing that the AI has only a surface-level understanding of human anatomy.
Were we fooled by our natural inclination to look another human in the eyes? Considering the prominence of the clip on the company’s website, it’s possible even OpenAI didn’t spot this one.
.@OpenAI unveiled their new AI model Sora, which creates video from text.
The way that AI video has improved over the last year is 🤯
A huge technological leap.
Here’s the prompt used for the AI video below:
“A stylish woman walks down a Tokyo street filled with warm glowing… pic.twitter.com/hu9Bvu7eNr
— Kezhal Dashti (@KezhalDashti) February 16, 2024
In all fairness, Sora’s capabilities are a quantum leap compared to earlier examples of AI-generated video. Remember that horrific AI clip of Will Smith eating (and grotesquely merging with) a bowl of spaghetti? That was less than a year ago.
And while the company’s latest showing was met with astonishment — headlines predominantly described OpenAI’s Sora samples as “photorealistic,” and Ars Technica warned of the repercussions for any sense of shared reality — the limitations of generative AI are still clear.
At the same time, the important thing isn’t where we are. As always, it’s where we’re headed — which should encourage companies like OpenAI to move with caution, even if today’s best demos are still imperfect.