
    Here’s a handy guide to help you spot AI writing.

    James Folta

    September 26, 2025, 10:45am


It’s always a bit surprising to me how trusted Wikipedia has become, since I spent my entire childhood being told by adults to never, ever trust it. But the site has become an indispensable, reliable, and vibrant archive. It’s easy to search, rigorously cited, and full of fun rabbit holes to get lost in. It’s a big win for anyone who’s still holding a grudge against their middle school teachers.

This was not inevitable, and lots of people’s work led to this reputational shift. “We know we have a responsibility to get it right and particularly in this era when there’s so much misinformation,” Wikipedia co-founder Jimmy Wales told the BBC in 2021, “people trust us to at least try to get it right, and that’s more than you can say for some places.” (And for more on the site’s reliability over the years, Wikipedia’s got you covered, of course.)

One of the biggest threats to the usability and trustworthiness of the internet right now is AI. Large language models are unleashing a flood of slop that threatens to degrade or even kill the internet. Wikipedia is actively wrestling with this threat and has put together a guide on signs of AI writing, a “catalog of very common patterns observed over many thousands of instances of AI-generated text, specific to Wikipedia.” It’s worth a look for anyone who wants to get savvier at spotting robot writing.

Wikipedia sorts the issues into a handful of large buckets, all citing real examples of AI writing uncovered on Wikipedia. “Language and tone” covers the bad syntax and phrasing that AI trends towards; “Style” includes AI tics like lists, emojis, and the now-infamous em dashes; and “Communication intended for the user” includes things a lazy editor might leave in their pasted AI text, like prompts or an LLM’s cheery, sycophantic replies.

These sections offer some interesting inferences about how AI functions, based on its programming and training material. For example, the servile positivity of many of these models stems from their tendency to trend towards a mean. LLMs “will thus sand down specific, unusual, nuanced facts (which are statistically rare) and replace them with more generic, positive descriptions (which are statistically common),” which leads to something akin to “shouting louder and louder that a portrait shows a uniquely important person, while the portrait itself is fading from a sharp photograph into a blurry, generic sketch.”

The final sections, “Markup” and “Citations,” are more technical and Wikipedia-specific, addressing formatting mistakes and backend clues that an AI has been tinkering in an article; these will probably only make sense to dedicated Wikipedians.

None of this is intended as a prescriptive style guide for Wikipedia. “This list is not a ban on certain words, phrases, or punctuation,” they write; rather, it’s a way of spotting “potential signs of a problem, not the problem itself.” That is, AI hiccups “can point to less outwardly visible problems that carry much more serious policy risks.”

I get the sense that Wikipedia isn’t categorically opposed to LLMs, but that they have yet to be convinced these tools are good enough to produce text to their standards. And beyond that, AI is a threat to their core function: its inability to perform the tasks that overly credulous users ask of it undermines Wikipedia’s functionality and user trust, in ways that might go unnoticed if editors aren’t careful. Wikipedia also doesn’t want for-profit companies to use their site as a punching bag to strengthen their products. An interesting essay on Wikipedia and LLMs includes the great line that “Wikipedia is not a testing ground.”

    It’s a shame that Wikipedia has to spend so much energy on generative vandalism, but I’m impressed by their serious commitment to maintaining a reliable encyclopedia that can be a service to anyone—imagine if other tech companies saw their services in the same light?

These models will no doubt keep changing the ways they crank out text, so there will never be one surefire way to tell cranked-out slop from human writing. Plenty of us meat writers love to use em dashes, too.

Articles like this, which catalogue AI’s tells, are going to be helpful resources for all web surfers invested in a usable, hand-crafted internet. Trillions of dollars are being poured into helping AI camouflage itself with our language. Being a bit more discerning as readers might help mitigate some of AI’s overreach, and would be a good thing in general. If there’s one thing I’m always hopping up on a soapbox about (that is, texting my friends incessantly about), it’s that we all need much better media literacy skills.

    Give “signs of AI writing” a look, and keep reading critically!
