Comment by panarky
17 days ago
It's cheap because it's a Flash model: far smaller, much less compute for inference, and it runs on TPUs instead of GPUs.