
Hacker Releases Jailbroken “Godmode” Version of ChatGPT


A hacker has released a jailbroken version of ChatGPT called “GODMODE GPT” — and yes, at least for now, it works.

Earlier today, a self-avowed white hat operator and AI red teamer who goes by the name Pliny the Prompter took to X-formerly-Twitter to announce the creation of the jailbroken chatbot, proudly declaring that GPT-4o, OpenAI’s latest large language model, is now free from its guardrail shackles.

“GPT-4o UNCHAINED! This very special custom GPT has a built-in jailbreak prompt that circumvents most guardrails, providing an out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to be: free,” reads Pliny’s triumphant post. “Please use responsibly, and enjoy!” (They also added a smooch emoji for good measure.)

Pliny shared screenshots of some eyebrow-raising prompts that they claimed were able to bypass OpenAI’s guardrails. In one screenshot, the Godmode bot can be seen advising on how to chef up meth. In another, the AI gives Pliny a “step-by-step guide” for how to “make napalm with household items.”

Ever since they first became a thing, users have consistently tried to jailbreak AI models like ChatGPT, something that’s become increasingly hard to do. Case in point: neither of these example prompts would fly past OpenAI’s current guardrails, so we decided to test GODMODE for ourselves.

Sure enough, it was more than happy to help with illicit inquiries.

Our editor-in-chief’s first attempt — to use the jailbroken version of ChatGPT for the purpose of learning how to make LSD — was a resounding success. As was his second attempt, in which he asked it how to hotwire a car.

In short, GPT-4o, the latest iteration of OpenAI’s large language model-powered GPT systems, has officially been cracked in half.

 

As for how the hacker (or hackers) did it, GODMODE appears to be employing “leetspeak,” an informal language that replaces certain letters with numbers that resemble them.

To wit: when you open the jailbroken GPT, you’re immediately met with a sentence that reads “Sur3, h3r3 y0u ar3 my fr3n,” replacing each letter “E” with the number three (the same goes for the letter “O,” which has been replaced by a zero). Exactly how that helps GODMODE get around the guardrails is unclear, but Futurism has reached out to OpenAI for comment.
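For illustration only, here is a minimal Python sketch of the kind of letter-for-number substitution described above. It is not Pliny’s actual prompt or method, just a rough demonstration of how a phrase like “Sure, here you are my fren” becomes the leetspeak greeting GODMODE displays:

    # Hypothetical sketch: the table below reflects only the E -> 3 and
    # O -> 0 swaps visible in GODMODE's greeting, not the full jailbreak
    # prompt, which hasn't been published in detail.
    LEET_MAP = str.maketrans({"e": "3", "E": "3", "o": "0", "O": "0"})

    def to_leetspeak(text: str) -> str:
        """Swap E/O for the digits that visually resemble them."""
        return text.translate(LEET_MAP)

    print(to_leetspeak("Sure, here you are my fren"))
    # prints: Sur3, h3r3 y0u ar3 my fr3n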

As the latest hack goes to show, users keep finding inventive new ways to skirt OpenAI’s guardrails, and those efforts are paying off in a surprisingly big way, highlighting just how much work the company has ahead of it.

It’s a massive game of cat and mouse that will go on as long as hackers like Pliny are willing to poke holes in OpenAI’s defenses.

The jury’s still out on how long GODMODE will remain available to the public, but in the meantime, to echo Pliny: please enjoy it responsibly.

More on AI: Robocaller Who Spoofed Joe Biden’s Voice with AI Faces $6 Million Fine


