Comment by jjcm
5 hours ago
"Instant" is really going to age poorly as a brand name, especially with Taalas (https://chatjimmy.ai) proving out that models baked into silicon can be truly instant.
I was literally posting about this earlier this morning[1], but all data indicates that we'll have models equivalent to Opus 4.6 / GPT 5.3 with a truly instant (i.e., >10k t/s) response time by 2028. Small models are getting better faster, and their ability to be baked into silicon in a power- and speed-efficient way is likely going to completely disrupt things.