Comment by lxgr
21 hours ago
Or just by somebody that knows how to use English punctuation properly.
Is it so hard to believe that there are some people in the world capable of hitting option + “-” on their keyboard (or simply letting their editor do it for them)?
I said em dash _and_ the word comprehensive. If you work with LLM-generated text enough, it gets very easy to see the telltale signs. The emojis at the start of each row in the table are also a dead giveaway.
I am guessing you are one of those people who used em dashes before LLMs came out and are now bitter they are an indicator of LLMs. If that's the case, I am sorry for the situation you find yourself in.
Yes, to me it’s become a tired trope of a particular kind of LLM Luddite.
Especially given that there are so many linguistic tics one could pick on instead: “not x, but y”, the bullseye emoji, etc. Instead they get hung up on a typographic character that is actually widely used, presumably because they assume it only occurs on professionals’ keyboards and that nobody would take enough care to use it in casual contexts.
If it makes a difference: it's an en dash used in the readme.
I've been wondering why LLMs seem to prefer the em dash over the en dash, as I feel like the en dash (or hyphen) is used more frequently in modern text.
In my experience the em dash is still correctly used; the modern style has just evolved to put spaces around it.
So:
* fragment a—fragment b (em dash, no spaces) = traditional
* fragment a — fragment b (em dash with spaces) = modern
* fragment a -- fragment b (two hyphens) = acceptable substitute when you can’t get a proper em dash to render
But en-dashes are for numeric ranges…
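For anyone who wants to check which of these conventions a piece of text actually uses, here is a minimal Python sketch. The function name `classify_dashes` and the label names are made up for illustration, and the patterns only cover the cases listed above (closed em dash, spaced em dash, double hyphen, en dash between digits); real-world text will have more edge cases.

```python
import re

# Hypothetical helper: labels are illustrative, not any standard taxonomy.
# \u2014 is the em dash, \u2013 is the en dash.
DASH_PATTERN = re.compile(
    r"(?P<range>\d\s*\u2013\s*\d)"      # en dash between digits, e.g. a page range
    r"|(?P<spaced_em>\s\u2014\s)"       # em dash with a space on each side
    r"|(?P<closed_em>\S\u2014\S)"       # em dash with no surrounding spaces
    r"|(?P<double_hyphen>\s--\s)"       # two hyphens standing in for an em dash
)


def classify_dashes(text: str) -> list[str]:
    """Return a label for each dash construct found, in order of appearance."""
    return [match.lastgroup for match in DASH_PATTERN.finditer(text)]
```

Calling `classify_dashes` on a README or comment gives a quick list of which styles appear, which is enough to tell an en dash range from an em dash aside before arguing about it.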
It's not an em-dash, it's an en-dash, which is rare in LLM output. Also just stop being insufferable.
> The emojis at the start of each row in the table are also a dead giveaway.
What's up with the green checks, red Xs, rockets, and other stupid emoji in AI slop? Is it an artifact from the cheapest place to do RLHF?
It's the LinkedIn post recommendation algorithm, AFAIK. The LI algo used to push such posts to the top. So my leap of thought is that somebody at MS decided that top LI posts are the go-to structure for "good text".
I have no proof, sorry.