← Back to context

Comment by stefan_

6 hours ago

Sorry but I think you just don't know a lot about LLMs. Why did they start spamming code with emojis? It's not because that is what people actually do, something that is in the training data. It's because someone reinforcement learned the LLM to do it by asking clueless people if they prefer code with emojis.

And so at this point the excessive bullet points and similar filler trash is also just an expression of whatever stupid people think they prefer.

Maybe I'm being too harsh and it's not the raters are stupid in this constellation, rather it's the ones thinking you could improve the LLM by asking them to make a few very thin judgements.