Sora, and the cost of real things.
In March, OpenAI abruptly announced the end of its Sora app and the underlying video-generation model. The most explicit reason for Sora’s discontinuation seems to be a refocusing on enterprise productivity applications, seeking some of the success that rival Anthropic has seen with Claude Code. Astute observers (including yours truly in my first Overmorrow essay last fall) also point to a more immediate concern—namely, that Sora burns piles of cash. OpenAI’s announcement on X, née Twitter, was silent on the reasons for the service’s demise, but it did say something else: “What you made with Sora mattered, and we know this news is disappointing.”
What you made with Sora mattered.
Did it?
Personally, I have never seen anything I knew to be an AI-generated video that I was particularly glad to have seen. I’ve seen very convincing videos and disturbingly “off” videos and everything in between, but none that have made me laugh, stirred my feelings, or edified me the way that human-generated videos have.
Now, to be fair to generative AI videos and their enthusiasts, this is also not a medium I spend a ton of time on. I loathe Instagram and I was never more than the most casual TikTok consumer. While I do use YouTube, suffice it to say it’s not the first place I turn to for information or entertainment. (As I once quipped on Mastodon, “You really think you can keep a secret from me? You’ll have to bury it deep. Hide it somewhere I’d never look, somewhere I’d never even think to look. Like in a how-to video on YouTube.”) Sora was not destined to become part of my media diet.
AI-generated media is hard to escape these days, though, so I am familiar with the state of the art. But while it has become ever more technically impressive and often convincingly realistic, I have yet to see a single example that I thought justified even the effort of the user’s prompt.
Did you see the Pepperoni Hug Spot ad a few years back? This was one of the first fully AI-generated videos I became aware of, and it definitely falls on the “cursed” end of the spectrum. Watching the short 2023 clip now, it seems like a strong candidate as the source of the term “AI slop.” Everything and everyone in it is recognizable, but deeply wrong—vignette after vignette of distorted faces, improbable movements, and errors of spatial reasoning. Shots of people eating pizza confuse the relationship between faces, mouths, and the food. I don’t know if the script was AI-generated or not; it’s almost too whimsically wrong. When the narrator lists ingredients such as “cheese, pepperoni, vegetable, and more secret things,” it evokes the Dadaist scripts from Keaton Patti’s absurdist “I forced a bot to watch 1,000 hours of ____ and asked it to write…” jokes on Twitter in the pre-LLM-boom days.
Video-generation models have come a long way in three years; a lot of AI slop video in 2026 doesn’t look anything like Pepperoni Hug Spot. But I can’t shake the feeling I’m still watching that nightmarish ad when I see one.
My main problem with Pepperoni Hug Spot (quality aside) is that it is fake in three ways. First, obviously there is no Pepperoni Hug Spot pizza franchise you can call for delivery. There’s no restaurant, there are no customers, there’s no pizza being made or sold. By itself, that would be fine—there are such things as fiction, parody, or dramatization.
Second, though, through the magic of generative AI, there’s also neither a shooting location nor a green screen against which these people were filmed while eating. Because they aren’t people. And there’s nothing for them to eat. No narrator voiced the voiceover. We aren’t seeing an SNL skit about a pizza place. We aren’t seeing anything—not even digitally drawn fakery or a Hollywood set.
Again, though, this is art. Or it would be, except there is no artist. And that leads to the third and final way Pepperoni Hug Spot is fake: there’s also no real human effort to speak of. Someone chose models and wrote prompts and vetted the output, sure, but the many hours of planning, the equipment, the talent and labor that would have gone into producing a 30-second pizza commercial are also missing.
And that is precisely why even the slicker, modern AI-generated videos leave me so cold: no matter how realistic they become, they are so fake they can’t even be called fake, because no one faked them. There’s nothing there. It’s a very convincing nothing, maybe. But it’s still a nothing.
How could the entire world implied by a Sora video be called “nothing”? Last year on Intelligent Machines, Cory Doctorow described how information becomes inflated by an AI model: Imagine that a student asks a faculty member for a letter of recommendation, and the faculty member takes three bullet points about the student and asks ChatGPT to draft a full letter from said bullet points. How much information is in the full-length letter of recommendation? At most, assuming nothing has been erased by the LLM’s flights of fancy, it’s only the original three bullet points! The model doesn’t know anything else about the student; everything else in the letter is either padding or a plausible but groundless guess.
The same is true of generated images and video. Everything you see, everything you hear, everything you glean from viewing it that wasn’t already in the prompt is something the author of the prompt does not actually know. The actual information, the words of the prompt, is stretched thin like the skin of a balloon, inflated by a cloud of statistically likely pixels and sounds that approximate the model’s training data.
Before the generative AI boom, of course, it was possible for images and video to be “fake.” In fact, at a certain remove, they’re all fake, even the true ones. “Art is a lie that makes us realize truth,” as Picasso said (or, fitting for this point, did not say, at least not exactly). There are limitations to every medium and technology, plus deliberate choices of composition. And happenstance. Even the most faithful journalist makes editorial decisions. The map is not the territory; the art is not what it depicts.
The difference, though, is that even when what was depicted wasn’t faithfully reproduced—even when it wasn’t real at all—the entire work was still information. Someone made it that way. That bird either really was in the background of the shot, or someone drew it in. Either that woman’s skin was that clear, or someone airbrushed out the imperfections. Or someone invented her from whole cloth in the first place! At a minimum, when we saw a picture, a video, or a string of words, we knew one thing: whether or not what we were looking at was “real,” whether or not it was true, it came from someone. What we saw was either the way it was, or the way someone made it. A given piece of media resulted from the application of skill and time and resources, from skin in the game.
I believe that is part of what has been, or at least might yet be, lost in the current moment. “Seeing is believing” has been a dubious standard for as long as there has been photography. Things could be “photoshopped” even before Photoshop (just ask Stalin). But pictures and videos still held a certain weight, because they either reflected reality or cost something real to fake.
That, of course, evokes the topic of misinformation, and related horrors like deepfakes. But those are worries for a different essay; I’ve rambled here for long enough. My point today is this: I’ve seen AI-generated videos reach a high level of quality. I’ve never seen an AI video rise above a very low level of value. I feel in my bones what Cory Doctorow meant about an entire generated letter containing only three bullet points’ worth of information. The generated video never contains more information than the prompt. In the modern era, that means it also lacks the one important piece of information that could be inferred about every work that came before AI: that someone cared enough to spend their talents and resources on making it.
OpenAI was wrong. What users made with Sora didn’t matter, because it was a platform for making videos that no one could actually be bothered to make in the first place.