From Text to Vision to Voice: Exploring Multimodality with OpenAI: Romain Huet

The future we are building towards, featuring demos of GPT-4o Omnimodel Voice, ChatGPT Desktop, Sora, and Voice Engine, all in one talk.

Recorded live in San Francisco at the AI Engineer World’s Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World’s Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Romain
Hello! I’m a software engineer based in San Francisco, and I run Developer Relations at OpenAI.

Previously, I ran developer relations at Stripe. Prior to that, I was a Senior Developer Advocate at Twitter and the first member of Twitter’s Developer Relations team outside the US. In 2014, I helped launch Fabric, our mobile developer platform, and Digits. In 2015, our developer tour led me to meet thousands of developers and entrepreneurs in more than 30 cities around the world.

Prior to Twitter, I was Co-Founder & CTO of Jolicloud, whose free operating system was designed to run on low-cost computers and connect them to the cloud. Joli OS was the first OS based on Linux, Chromium, and HTML5, paving the way towards a new generation of browser-based platforms like Chrome OS. In 2010, the Jolibook was a finalist for “Netbook of the Year” at the Engadget Awards, alongside Google’s first Chromebook.
