Comment by panarky
5 months ago
It's cheap because it's a Flash model: far smaller, needs much less compute for inference, and runs on TPUs instead of GPUs.