Comment by embedding-shape

2 days ago

> If someone buys one of these $8000 GPUs to run GLM-4.7, they're going to be immensely disappointed. This is my point.

Absolutely, same if they get a $10K Mac/Apple computer, immense disappointment ahead.

Best is of course to start looking at models that fit within 96GB, but that'd make too much sense.

$10k is more than 4 years of a $200/mo sub to models that are currently far better, continue to get upgraded frequently, and have improved tremendously in the last year alone.
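For reference, a quick sketch of that break-even math (ignoring electricity, resale value, and any price changes on either side; the figures are the ones from this thread, not real quotes):

```python
# Break-even between a one-time hardware purchase and a monthly subscription.
hardware_cost = 10_000   # one-time hardware outlay, USD (figure from the thread)
monthly_sub = 200        # subscription cost, USD per month

breakeven_months = hardware_cost / monthly_sub
print(breakeven_months)        # months of subscription the hardware costs
print(breakeven_months / 12)   # same figure in years
```

That comes out to 50 months, a bit over 4 years, before the hardware purchase matches the subscription spend.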

This feels more like a retro computing hobby than anything aimed at genuine productivity.

  • I don't think the calculation is that simple. With your own hardware, there are literally no limits on runtime, which models you use, what tooling you use, or availability; all of those things are up to you.

    Maybe I'm old school, but I prefer those benefits over a cost/benefit analysis spanning 4 years, when by the time we're 20% through it, everything will have changed.

    But I also use this hardware for training my own models, not just inference and not just LLMs. I'd agree with you if we were talking about only LLM inference.