Comment by astrange
4 hours ago
> The LLMs appear to be doing exactly what one would expect them to be doing based on their training corpus.
That is not how full LLM training works; that is how base model pretraining works. Full training also includes post-training stages such as instruction tuning and RLHF, which shape behavior beyond what the pretraining corpus alone would predict.