OpenAI is launching wider availability of its latest text-to-image generator. On Thursday, the company is giving ChatGPT Plus and Enterprise customers access to the new DALL-E 3 model that works within the ChatGPT app. OpenAI says it has prepared a safety mitigation stack for the model that makes it ready for an expanded release.
DALL-E 3 was first announced last month, when OpenAI showed how it improves on DALL-E 2 by letting users have ChatGPT write longer, more visually descriptive prompts to feed the image generator. DALL-E 3 was added to Bing Chat and Bing Image Generator, making Microsoft’s platform the first to offer wider public access to the model, even before ChatGPT.
The advertised guardrails against harmful imagery haven’t always worked: users have generated images of SpongeBob SquarePants and other characters piloting planes toward the World Trade Center. Even after Microsoft blocked certain prompts, simple workarounds produced similar results.
Text-to-image generators like Midjourney, Stable Diffusion, and older DALL-E iterations have all had their fair share of controversy. The tech has produced copyrighted image material, nonconsensual nudes, images that shift the ethnicity of their subjects, and photorealistic misrepresentations of public figures.
OpenAI is promising it has taken much more extensive steps this time around and is providing a website that shows the research put into DALL-E 3. The company says it will “limit the model’s likelihood of generating content in the style of living artists, images of public figures, and to improve demographic representation across generated images.” OpenAI also has an internal “provenance classifier” tool that it says is 99 percent accurate at detecting whether an image was generated by DALL-E 3.