Comment by bearjaws
2 days ago
I am sorry, but anyone who has actually tried this knows it is horrifically slow, significantly slower than just typing it yourself, for any model worth its salt.
That 128 GB of RAM is nice, but the time to first token is very long on any context over 32k tokens, and the results are not even close to Codex or Sonnet.