
Comment by throw310822

4 days ago

Interesting observation. On the one hand, these more closely resemble the notes an actual participant would write while solving the problem. Fewer words also means less noise and more focus. But specifically for LLMs, which output one token at a time and have a limited context window, I wonder whether restricting itself to semantically meaningful tokens lets the model sustain longer stretches of semantically coherent thought.

The original thread mentions “test-time compute scaling” so they had some architecture generating a lot of candidate ideas to evaluate. Minimizing tokens can be very meaningful from a scalability perspective alone!
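To make that concrete, here's a minimal best-of-N sketch of what "generating a lot of candidate ideas to evaluate" could look like. The `generate` and `score` callables and the best-of-N strategy are my assumptions for illustration, not anything confirmed in the thread:

```python
# Hypothetical sketch: best-of-N test-time compute scaling.
# `generate` and `score` are stand-ins for whatever sampler/verifier
# the actual system uses; nothing here is from the original thread.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Candidate:
    text: str
    tokens: int   # crude proxy: word count stands in for token count
    score: float

def best_of_n(generate: Callable[[str], str],
              score: Callable[[str], float],
              prompt: str,
              n: int = 16) -> Candidate:
    """Sample n candidate solutions, score each, keep the best.

    Total cost scales with sum(c.tokens for c in candidates), so terser
    candidates directly cut the compute bill at fixed n -- or let you
    raise n within the same budget.
    """
    candidates = []
    for _ in range(n):
        text = generate(prompt)  # one sampled reasoning trace
        candidates.append(Candidate(text, len(text.split()), score(text)))
    return max(candidates, key=lambda c: c.score)
```

At a fixed token budget, halving the average candidate length doubles the number of candidates you can evaluate, which is exactly why terse traces help scalability.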

  • This is just speculation, but I wouldn't be surprised if there were some symbolic-AI 'tricks'/tools (and/or modern AI trained to imitate symbolic AI) under the hood.