Comment by panarky
7 months ago
It's cheap because it's a Flash model: far smaller, with much less compute needed for inference, and it runs on TPUs instead of GPUs.