
Can watermarks save us from deepfakes?


A video of Elizabeth Warren saying Republicans shouldn’t vote went viral in 2023. But it wasn’t Warren. That video of Ron DeSantis wasn’t the Florida governor, either. And nope, Pope Francis was not wearing a white Balenciaga coat. 

Generative AI has made it easier to create deepfakes and spread them around the internet. One of the most commonly proposed solutions involves the idea of a watermark that would identify AI-generated content. The Biden administration has made a big deal out of watermarks as a policy solution, specifically calling on tech companies to find ways to identify AI-generated content. The president's executive order on AI, released in October, built on commitments from AI developers to figure out ways to tag content as AI-generated. And it's not just coming from the White House: legislators, too, are looking at enshrining watermarking requirements in law. 

Watermarking can't be a panacea: for one thing, most systems simply can't tag text the way they can tag visual media. Still, people are familiar enough with watermarks that the idea of watermarking an AI-generated image feels natural. 

Pretty much everyone has seen a watermarked image. Getty Images, which distributes licensed photos taken at events, uses a watermark so ubiquitous and so recognizable that it is its own meta-meme. (In fact, the watermark is now the basis of Getty's lawsuit against Stability AI, with Getty alleging that the company must have trained Stable Diffusion on its copyrighted images, since the model reproduces the Getty watermark in its output.) Of course, artists were signing their works long before digital media, or even the rise of photography, to let people know who created a painting. But watermarking itself, according to A History of Graphic Design, began during the Middle Ages, when monks would change the thickness of the printing paper while it was wet to add their own mark. Digital watermarking rose in the '90s as digital content grew in popularity, and companies and governments began embedding tags (hidden or otherwise) in media to make it easier to track ownership, copyright, and authenticity. 

Watermarks will, as before, still denote who owns and created the media that people are looking at. But as a policy solution for the problem of deepfakes, this new wave of watermarks would, in essence, tag content as either AI or human generated. Adequate tagging from AI developers would, in theory, also show the provenance of AI-generated content, thus additionally addressing the question of whether copyrighted material was used in its creation. 

Tech companies have taken the Biden directive and are slowly releasing their AI watermarking solutions. Watermarking may seem simple, but it has one significant weakness: a watermark pasted on top of an image or video can be easily removed via photo or video editing. The challenge becomes, then, to make a watermark that Photoshop cannot erase. 


Companies like Adobe and Microsoft, members of the industry group Coalition for Content Provenance and Authenticity, or C2PA, have adopted Content Credentials, a standard that attaches provenance information to images and videos. Adobe has created a symbol for Content Credentials that gets embedded in the media; Microsoft has its own version as well. Content Credentials embeds certain metadata, like who made the image and what program was used to create it, into the media; ideally, people will be able to click or tap on the symbol to look at that metadata themselves. (Whether this symbol can consistently survive photo editing is yet to be proven.) 
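As a rough illustration of the metadata half of this approach, here's a minimal Python sketch using Pillow to write provenance fields into PNG text chunks. This is a toy stand-in, not the real Content Credentials format: the actual C2PA standard embeds cryptographically signed manifests in the file, and the function names here (`embed_provenance`, `read_provenance`) are my own.

```python
from io import BytesIO

from PIL import Image
from PIL.PngImagePlugin import PngInfo


def embed_provenance(img: Image.Image, fields: dict) -> BytesIO:
    """Save an image with provenance key/value pairs as PNG text chunks.

    A simplified stand-in for Content Credentials; real C2PA manifests
    are signed, so a viewer can verify they haven't been tampered with.
    """
    info = PngInfo()
    for key, value in fields.items():
        info.add_text(key, value)
    buf = BytesIO()
    img.save(buf, format="PNG", pnginfo=info)
    buf.seek(0)
    return buf


def read_provenance(buf: BytesIO) -> dict:
    """Read the text chunks back, as a 'tap the symbol' viewer might."""
    return dict(Image.open(buf).text)
```

Note how fragile this is: re-encoding the image (a screenshot, a quick crop-and-save) silently drops these chunks, which is exactly the weakness described above.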

Meanwhile, Google has said it’s currently working on what it calls SynthID, a watermark that embeds itself into the pixels of an image. SynthID is invisible to the human eye, but still detectable via a tool. Digimarc, a software company that specializes in digital watermarking, also has its own AI watermarking feature; it adds a machine-readable symbol to an image that stores copyright and ownership information in its metadata. 
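Google hasn't published how SynthID actually works, so the sketch below only shows the classic technique it gestures at: least-significant-bit (LSB) watermarking, which hides a machine-readable signal in pixel values without visibly changing the image. The function names are my own, and a production system like SynthID is certainly more sophisticated and robust than this.

```python
def embed_bits(pixels: list[int], bits: list[int]) -> list[int]:
    """Hide one watermark bit in the least significant bit of each
    pixel value (0-255). Each value changes by at most 1, which is
    invisible to the eye but readable by a tool that knows to look."""
    marked = [(p & ~1) | b for p, b in zip(pixels, bits)]
    return marked + pixels[len(bits):]


def extract_bits(pixels: list[int], n: int) -> list[int]:
    """Recover the first n hidden bits."""
    return [p & 1 for p in pixels[:n]]
```

A toy scheme like this is destroyed by compression or resizing; the whole point of a watermark like SynthID is to survive exactly those kinds of edits.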

All of these attempts at watermarking look to either make the watermark unnoticeable to the human eye or punt the hard work over to machine-readable metadata. It's no wonder: this approach is the most reliable way to store information that can't simply be stripped out, and it encourages people to look closer at an image's provenance. 

That's all well and good if what you're trying to build is a copyright detection system, but what does that mean for deepfakes, where the problem is that fallible human eyes are being deceived? Watermarking puts the burden on the consumer, relying on an individual's sense that something isn't right to send them looking for more information. But people generally do not make it a habit to check the provenance of anything they see online. Even if a deepfake is tagged with telltale metadata, people will still fall for it; we've seen countless times that when information gets fact-checked online, many people still refuse to believe the correction.

Experts feel a content tag is not enough to stop disinformation from reaching consumers, so why would watermarking fare any better against deepfakes?  

The best thing you can say about watermarks, it seems, is that at least it’s anything at all. And due to the sheer scale of how much AI-generated content can be quickly and easily produced, a little friction goes a long way.

After all, there’s nothing wrong with the basic idea of watermarking. Visible watermarks signal authenticity and may encourage people to be more skeptical of media without it. And if a viewer does find themselves curious about authenticity, watermarks directly provide that information. 


Watermarking can't be a perfect solution for the reasons I've listed (and besides that, researchers have been able to break many of the watermarking systems out there). But it works in tandem with a growing wave of skepticism toward what people see online. I have to confess that when I began writing this, I believed it was easy to fool people into thinking really good DALL-E 3 or Midjourney images were made by humans. But I've since realized that discourse around AI art and deepfakes has seeped into the consciousness of many chronically online people. Instead of accepting magazine covers or Instagram posts as authentic, there's now an undercurrent of doubt. Social media users regularly investigate and call out brands when they use AI; look at how quickly internet sleuths called out the opening credits of Secret Invasion and the AI-generated posters in True Detective.

It’s still not an excellent strategy to rely on a person’s skepticism, curiosity, or willingness to find out if something is AI-generated. Watermarks can do good, but there has to be something better. People are more dubious of content, but we’re not fully there yet. Someday, we might find a solution that conveys something is made by AI without hoping the viewer wants to find out if it is. 

For now, it’s best to learn to recognize if a video isn’t really of a politician. 


