Comment by foxes
13 hours ago
Are em dashes in language models particularly close to a start token or something, somehow letting the model keep generating output?
I think it's mainly a matter of clarity: long embedded clauses without obvious visual delimiters can be hard to read, so they are discouraged in professional writing that aims to be easy to read for a wide audience. LLMs are trained on that style.