Comment by manmal
20 hours ago
The SOTA models will always run in data centers, because data centers have 5x or more the VRAM and 10-100x the compute budget of a single developer machine. Plus, they can scale with batch inference, which is a huge power saving per request and something a single developer machine can't make full use of.
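A rough sketch of the batching point, assuming token generation is limited by memory bandwidth and that each decode step reads every model weight once, regardless of how many requests share that step. KV-cache and activation traffic are ignored. The function name and the 70B-parameter, 2-bytes-per-parameter figures are hypothetical and only there for illustration.

```python
def weight_bytes_read_per_token(model_bytes: float, batch_size: int) -> float:
    """Bytes of model weights streamed from memory per generated token.

    Assumption: one decode step reads every weight once and yields one
    token for each of the batch_size sequences in the batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return model_bytes / batch_size


# Hypothetical model size: 70e9 parameters at 2 bytes each.
MODEL_BYTES = 70e9 * 2

for batch_size in (1, 8, 64):
    per_token = weight_bytes_read_per_token(MODEL_BYTES, batch_size)
    print(f"batch={batch_size:>2}: {per_token / 1e9:.1f} GB of weights read per token")
```

Under these assumptions a single-user machine sits at batch size 1 and pays the full weight read for every token, while a server batching many requests spreads the same read across all of them.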