
Substance and substrate

If you’re concerned about the spread of AI Slop, pay more attention to how it spreads than to the slop itself.

AI Slop

The AI wave that reached popular awareness in early 2023 is called “generative AI.” Its twin poster children were ChatGPT, a model and text chatbot by OpenAI, and Midjourney, an image generation model developed by its eponymous company, both released in 2022. The technology has various particular names, such as “Large Language Model” (LLM) for text generation, or “Diffusion Model,” primarily for image generation. We laypeople collectively describe it as “GenAI.” To many of us, the quality of its outputs ain’t great. We may testify to the arrival of “AI Slop.”


AI Slop is a term that’s emerged in the past few years as a sort of spiritual descendant of the term “spam,” first used in 1993 to describe unwanted posts on Usenet. Slop can refer to GenAI images, generally, if the speaker simply regards all of that species as “unwanted.” More narrowly, Slop, like Spam, is low-quality, high-volume content resulting from a commercially motivated numbers game. Alex Hern and Dan Milmo describe it in this 2024 Guardian piece:

[AI Slop is] what you get when you shove artificial intelligence-generated material up on the web for anyone to view… it functions mostly to create the appearance of human-made content, benefit from advertising revenue and steer search engine attention towards other sites.

In this meaning, Slop is a subset of Spam. Mitigating spam is an exercise in microeconomics: email providers and social media operators endeavor to raise the marginal cost of production and distribution that spam producers face. The advent of GenAI dramatically lowers the cost of production - but, so far, does not change the costs of distribution. How, then, does that slop heave down my trough?

Recommendation Systems

Since the early 2010s, the most successful tech companies, like Google and Facebook, have built increasingly sophisticated systems for ranking (recommending) content for users. Following early examples such as Google’s PageRank and Facebook’s EdgeRank, many platforms have built algorithms - in fact, complex collections of algorithms - to sequence information for consumers in some intentional way. Because these YouTubes and TikToks and Instagrams and Xitters are advertising businesses, they design their “value models” to maximize “engagement” - the quantity of attention users spend.

The value proposition of AdTech is that it can deliver paid content (ads) to viewers with a greater Return On Ad Spend (ROAS) for an advertiser than other channels that advertiser might have used. Accordingly, the recommendation systems AdTech companies build are coveted proprietary tech, with very little public information about their innards. As an exhibit, take a look at this sparse Facebook Help Center article about feed ranking.

Let’s grant as a Commandment of social media platform owners, “thou shalt maximize attention.” A consequence of this Prime Directive is that the operators of Feeds must measure attention somehow. In some cases they measure the literal milliseconds, logged client-side, that a user lingers on a piece of content in their phone’s display. At Meta, this metric is called Viewport Views (VPVs). The platforms also measure proxies - or predictors - of time spent, through engagement signals like the “like” button, comments, re-sharing, clicking on URLs, and so forth.
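To make “value model” a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the signal names, the weights, and the predicted probabilities are invented, and real platforms do not publish theirs. It only shows the general shape of the idea, which is to score each post by a weighted sum of predicted engagement signals and sort by that score.

```python
from dataclasses import dataclass

# Hypothetical weights: how much this imaginary platform values each
# predicted engagement signal. Real weights are proprietary.
WEIGHTS = {"like": 1.0, "comment": 4.0, "share": 8.0, "click": 2.0}


@dataclass
class Post:
    post_id: str
    predictions: dict  # predicted probability of each engagement signal


def value_score(post, weights=WEIGHTS):
    """Weighted sum of predicted engagement probabilities for one post."""
    return sum(w * post.predictions.get(signal, 0.0) for signal, w in weights.items())


def rank_feed(posts):
    """Order posts from highest to lowest value score."""
    return sorted(posts, key=value_score, reverse=True)


# Invented example posts and predictions.
posts = [
    Post("calm_essay", {"like": 0.05, "comment": 0.01, "share": 0.005, "click": 0.10}),
    Post("outrage_clip", {"like": 0.10, "comment": 0.08, "share": 0.04, "click": 0.05}),
]

ranked_ids = [p.post_id for p in rank_feed(posts)]
print(ranked_ids)
```

Under these made-up weights, the post that is predicted to provoke comments and shares outranks the one that merely gets clicked. Change the weights and the order can flip, which is the sense in which the ranking is a business decision rather than a law of nature.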

Advertisers–your familiar brands and your sketchier growth hackers alike–must navigate the incentive system of the Feed. How should I construct the content that I publish in order to get the largest amount of “engagement signals” that the platform has decided to reward with higher positions in the Feed, and thus more eyeballs?

There are two pieces to the answer: substance and substrate.

Substance

This is the “what” - the piece of content itself. What should I publish to maximize attention? The answer is old: “if it bleeds, it leads.” Sensationalism has been a reliable tool in the belt of the publisher for centuries. In 18th-century London, “scandal sheets” like The Morning Post published salacious (regularly false) celebrity gossip. In the 19th-century U.S., William Randolph Hearst and Joseph Pulitzer employed exaggeration and illustration–“yellow journalism”–on subjects like crime, scandal and violence in efforts to outpace each other’s news circulation. As radio took off, shock jocks from Father Charles Coughlin to Rush Limbaugh enraptured their audiences with vitriol. The age of the 24-Hour News Cycle followed, with outlets digging up scarier and ickier stories to keep their viewers rapt. Take it from CBS CEO Leslie Moonves, who said of Donald Trump’s incendiary 2016 presidential candidacy, “It may not be good for America, but it’s damn good for CBS.”

It’s not all gloom and doom though - how about a palate cleanser of “soft news” and human interest stories? Cue the cute cat content. Or film yourself getting ice dumped on you for a good cause, and tag some friends!

This speedrun of attention-grabbing media suggests a sort of “ABCDs” of attractive content: Animals, Babes, Challenges, and Disasters. But why do these hold true on social media?

Substrate

This is the “Who, When, Where, How” - the distribution system. The “social” part of social media is about the network, also called the social graph, and it is the most valuable asset controlled by a platform. Just like the proprietary feed ranking logic, social graphs are very difficult to observe with any specificity and completeness because they constitute all of the pathways - or “edges” - by which content originates from and arrives at different publishers or users - “nodes.” Platforms claim that third parties systematically gathering information about the network violates their Terms of Service, as NYU misinformation researchers learned from Meta in 2021.


In the absence of reliable third-party research into how social media networks are shaped, and how content moves through them, we rely on platforms’ own remarks on the topic. Take Meta’s Widely Viewed Content Report for the Facebook product, for example. Here’s their “where posts in Feed come from” graph from Q2 2025.

This is a representation of the inventory - the set of all content that may appear in a user’s feed - but it says nothing about ranking - the logic used for prioritizing content in the feed. Notably, about a third of content in an average user’s feed inventory is from “unconnected” sources, which is to say the entire universe of content on Facebook.

In an illuminating 2018 post (covered by TechCrunch here), Mark Zuckerberg describes how sensationalist content drives more engagement:

“...when left unchecked, people will engage disproportionately with more sensationalist and provocative content… Our research suggests that no matter where we draw the lines for what is allowed, as a piece of content gets close to that line, people will engage with it more on average -- even when they tell us afterwards they don't like the content.”

Several years later, Meta published its Content Distribution Guidelines, which describe the qualities of posts that will receive “demotions” in feed (e.g. clickbait).

The positive feedback loop of “more sensationalist → more attention” may originate in human psychology, but it is enabled or curbed by the feed ranking decisions platforms make.

Character & Content, Substance & Substrate

Together, the shape of a network and the logic of ranking in recommendation systems constitute what I’m calling the substrate. A given piece of content is the substance. The substrate and the substance interact, but it is the substrate that ultimately determines what content a user will see.

This is another reflection of what Marshall McLuhan described as the content and the character of media. In his formulation, content is what a person may superficially apprehend a piece of media to be: “a person in a bikini.” The character, then, is the meaning and consequence that media portends, in some social context: “I find this sexy and it feels good to look at.” Character is a more abstract set of attributes underlying the content of a piece of media.

Here’s how McLuhan’s content and character relate to the substance and substrate of social media distribution.

  • Content = Substance (these are synonymous)

  • Character

    • abstracted attributes or categorizations of the substance (ABCDs)

    • the substance’s propensity for engagement, such as the probability a viewer will click the “like” icon on it - stylized as p(like) ← this is the consequence of feed ranking model design, a part of substrate

  • Substrate

    • the shape of the social graph: edges and nodes

    • features of the feed ranking models: what the business chooses to “uprank” to maximize attention

We can compose the resulting user experience as:

Substrate + Character = Content you see
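As a toy illustration of that composition, the Python sketch below builds a feed in two steps: the social graph decides which posts are even eligible (the inventory), and a predicted p(like) decides their order. The graph, the posts, the p(like) values, and the allow_unconnected switch are all invented for this example; it is a sketch under those assumptions, not a description of how any real platform works.

```python
# Substrate, part 1: a hypothetical social graph (who follows whom).
FOLLOWS = {"you": {"friend", "meme_page"}}

# Substance: invented posts, each with an author and a made-up p(like),
# standing in for the "character" the ranking model predicts.
POSTS = [
    {"id": "friend_vacation", "author": "friend", "p_like": 0.20},
    {"id": "meme_page_egg", "author": "meme_page", "p_like": 0.35},
    {"id": "stranger_shrimp_jesus", "author": "stranger", "p_like": 0.60},
]


def build_feed(user, posts, follows, allow_unconnected=True, size=2):
    """Return the ids of the top `size` posts for `user`."""
    connected = follows.get(user, set())
    # Inventory: everything the graph (plus platform policy) lets through.
    inventory = [p for p in posts if allow_unconnected or p["author"] in connected]
    # Substrate, part 2: the ranking rule. Here it is simply "highest p(like) first".
    ranked = sorted(inventory, key=lambda p: p["p_like"], reverse=True)
    return [p["id"] for p in ranked[:size]]


with_unconnected = build_feed("you", POSTS, FOLLOWS)
connected_only = build_feed("you", POSTS, FOLLOWS, allow_unconnected=False)
print(with_unconnected)
print(connected_only)
```

The posts are identical in both calls. Only the substrate changes, and with it the feed: once “unconnected” sources are allowed into the inventory, the stranger’s post takes the top slot and the friend’s vacation photos drop out.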

McLuhan develops these ideas of content and character into his seminal argument, “the medium is the message.”

Sometimes non-sensational content goes viral. The curious episode of the Instagram egg (2019) illustrates this. The egg does not readily fit any kind of “ABCD” in my earlier gesture at taxonomy. Instead, it reveals a “character” that is more important than “content.” Wired’s Louise Matsakis explains that the authors exhorted viewers to help “get the most liked post on Instagram.” The post’s character may also have had something to do with consumers’ self-aware frustration at the commercialization of social media, as well as with specific nihilistic meme cultures. The network part of the substrate was also important: the account itself had roughly 6M followers, and it tagged several enormous content-sharing Instagram accounts: what a network theorist would describe as “nodes with very high degrees.”
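For readers unfamiliar with the term, a node’s degree is just the number of edges attached to it; for a follower graph, the in-degree is the follower count. The short Python sketch below counts in-degrees over a tiny, entirely made-up edge list (the account names are placeholders, not real data) to show what “very high degree” means in practice.

```python
from collections import Counter

# Hypothetical follower edges, written as (follower, followed).
EDGES = [
    ("ann", "egg_account"), ("bob", "egg_account"), ("cat", "egg_account"),
    ("ann", "big_meme_page"), ("bob", "big_meme_page"),
    ("cat", "big_meme_page"), ("dan", "big_meme_page"),
    ("ann", "bob"),
]


def in_degrees(edges):
    """Count how many followers each account has."""
    return Counter(followed for _, followed in edges)


degrees = in_degrees(EDGES)
# The two accounts with the most followers: the "hubs" of this toy graph.
hubs = [node for node, _ in degrees.most_common(2)]
print(degrees)
print(hubs)
```

A post that reaches a hub can travel along every one of that hub’s edges at once, which is why tagging a few enormous accounts can matter more than what the post depicts.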

It may be superficially important that a social media post be sexy, or hilarious, or horrible - what McLuhan would call attributes of its content. But it is fundamentally important how that content - even just a photo of an egg - has reached its viewer. The shape of the underlying network matters more than the content that moves across it. It is the attributes of a post’s character, in McLuhan’s parlance, paired with the shape of the network, that determine how it will move through that network.

The medium is the message. The character underlies the content. Substance follows substrate.

What About AI Slop?

AI Slop is content/substance, but feed ranking algorithms still dictate what character is rewarded. This autumn, we are seeing the emergence of new feeds just for GenAI: Sora, by OpenAI, and Vibes, by Meta, are two big examples. If their ranking logic were laid bare, I would expect to find the same value models we have seen in social media for the past decade, and in sensationalist media for centuries.

The technological advance of GenAI dramatically lowers the barriers for a person to create any kind of content. So far, GenAI content - of any character - all has an underlying character of “AI-ness.” Our ability to sniff that out will wane week by week, until GenAI content is utterly indistinguishable from human-made content. But irrespective of the provenance, substance will continue to appear in front of you as a consequence of your network and the value models that social platforms build.

[Image: GenAI image of a Jesus figure made of shrimp.]
[Image: Pie chart of the sources of Feed content views in the United States.]