Comment by cedws

6 months ago

I had a similar idea[0]; interesting to see that it actually works. The faster LLM workloads run, the more ‘thinking’ the LLM can do before it emits a final answer.

[0]: https://news.ycombinator.com/item?id=41377042