
Posted by ahamez 11/2/2025

Why do AI models use so many em-dashes?(www.seangoedecke.com)
98 points | 96 comments | page 2
0xbadc0de5 11/2/2025
My first thought was watermarking. Same for its affinity for using emojis in bullet lists.
iainctduncan 11/3/2025
This has always seemed intuitively obvious to me. I use a lot of em dashes... because I read a lot. Including a lot of older, academic, or more formally written books. And the amount used in AI prose has never struck me as odd for the same reason. (Ditto for semicolons.)

The truth is ... most people don't read much. So it's not too surprising they think it looks weird if all they read is posts on the internet, where the average writer has never even learned how to make one on the keyboard.

Delve on the other hand, that shit looks weird. That is waaay over-represented.

redheadednomad 11/3/2025
"If AI labs wanted to go beyond that, they’d have to go and buy older books, which would probably have more em-dashes."

Actually, they wouldn't have to go and buy these old books: The texts are already available copyright free, due to legislation stating that copyright expires 70 years after the author's death (any book published in the USA before 1923 is also reproducible without adherence to copyright laws), making the full texts of old books much easier to find on the internet!

batterylake 11/4/2025
Very interesting topic. I also wonder why other signs of AI writing, such as negative parallelism ("It's not just X, it's Y"), are preferred by the models.

Also, I wrote a small extension that automatically replaces em dashes in ChatGPT responses with alternative punctuation marks: https://github.com/nckclrk/rm-em-dashes
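
For illustration only, the general idea of swapping em dashes for other punctuation can be sketched in a few lines of Python. This is not the code of the linked extension; the function name `replace_em_dashes` and the choice of a comma as the default replacement are assumptions made for the example.

```python
import re

# Matches an em dash (U+2014) plus any whitespace around it,
# so both "word\u2014word" and "word \u2014 word" are handled.
_EM_DASH_PATTERN = re.compile(r"\s*\u2014\s*")


def replace_em_dashes(text: str, replacement: str = ", ") -> str:
    """Replace every em dash in `text` with `replacement`.

    Hypothetical helper for illustration; a real tool would likely
    pick the replacement (comma, colon, parentheses) from context.
    """
    return _EM_DASH_PATTERN.sub(replacement, text)


if __name__ == "__main__":
    sample = "I like them\u2014a lot \u2014 but readers are suspicious."
    print(replace_em_dashes(sample))
```

A blanket comma substitution will sometimes produce a comma splice, which is one reason a context-aware choice of punctuation would be preferable in practice.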

spidersouris 11/2/2025
What we also learned after GPT-3.5 is that, to circumvent the need for new training data, we could simply resort to existing LLMs to generate new, synthetic data. I would not be surprised if the em dash is the product of synthetically generated data (perhaps forced to be present in this data) used for the training of newer models.
keiferski 11/2/2025
I am no grammarian, but I feel like em-dashes are an easy way to tie together two different concepts without rewriting the entire sentence to flow more elegantly. (Not to say that em-dashes are inelegant, I like them a lot myself.)

And so AI models are prone to using them because they require less computation than rewriting a sentence.

bitshiftfaced 11/2/2025
This is sort of my thinking too. It's finding the next token once the previous ones have been generated. Dashes are an efficient way to continue a thought once you've already written a nearly complete sentence, without creating a run-on sentence. They're efficient in the sense that they leave more grammatically correct options open for what comes next, even when you've already committed to the previous tokens.
atoav 11/3/2025
As someone who used em-dashes extensively before LLMs I can only hope (?) some of myself is in there. I really liked em-dashes, but now I have to actively avoid them, because many people use them as a marker to recognize text that has been invented by the stochastic machine.
shadowvoxing 11/3/2025
This episode of Big Technology Podcast goes into the reason why: https://pca.st/episode/4090833a-2abd-42b2-a31d-ebb2b4348007
Etheryte 11/2/2025
Another reason that I think contributes to it, at least partially, is that other languages use em-dashes. Most people use LLMs in English, but that's not the only language they know, and many other languages have pretty specific rules and uses for em-dashes. For example, I see em-dashes regularly in local European newspapers, and I would expect those to be written by a human for the most part, simply because LLM output is not good enough in smaller languages.
stonecharioteer 11/2/2025
I've been using em-dashes in my own writing for years and it's annoying when I get accused of using AI in my posts. I've since switched to using commas, even though it's not the same.
manuelmoreale 11/2/2025
You should tell the people that are accusing you to go fuck themselves and you should keep writing the way you like. You were here before AI, don't let it dictate how you behave.