Comment by RandyOrion

14 days ago

People who downvoted this comment, do you guys really have GPUs with 80GB VRAM or M3 ultra with 512GB rams at home?

I don't. I have no problem not running open-weight models myself because there's an efficiency gap of two orders of magnitude between "pretend-I-can" solution and running them on hundreds of H100s for high thousands of users.