Comment by PlatoIsADisease

12 days ago

>The model absolutely can be run at home.

There is a huge difference between "look, I got it to answer the prompt '1+1='" and actually using it for anything of value.

I remember early on people bought Macs (or some marketing team was shoveling it) and proposed that people could reasonably run the 70B+ models on them.

They were talking about 'look it gave an answer', not 'look this is useful'.

While it was a bit obvious that an 'integrated GPU' is not Nvidia VRAM, we did have one Mac laptop at work that confirmed this.

It's cool these models are out in the open, but it's going to be a decade before people are running them at a useful level locally.

Hear, hear. Even if the model fits, running it at a few tokens per second makes no sense. Time is money too.

  • If I can start an agent, walk away for 8 hours, and be confident it's 'smart' enough to complete a task unattended, that's still useful.

    At 3 tk/s, that's still 100-150 pages of a book, give or take.

    • True, that's still faster than a human, but they're not nearly that reliable yet.

  • Maybe for a coding agent, but a daily/weekly report on sensitive info?

    If it were 2016 and this technology existed but only at 1 t/s, every company would find a way to extract the most leverage out of it.
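
For anyone who wants to sanity-check the "100-150 pages" figure from the 3 tk/s reply, here is a rough back-of-the-envelope sketch. The conversion factors (about 0.75 words per token and about 500 words per page) are my assumptions, not numbers from the thread, and the function names are made up for illustration.

```python
def tokens_generated(tokens_per_second: float, hours: float) -> float:
    """Total tokens produced at a steady rate over the given number of hours."""
    return tokens_per_second * hours * 3600


def estimated_pages(tokens: float,
                    words_per_token: float = 0.75,   # assumed rule of thumb
                    words_per_page: float = 500.0):  # assumed dense book page
    """Convert a token count into an approximate page count."""
    return tokens * words_per_token / words_per_page


# The 8-hour unattended run at 3 tk/s from the thread.
tokens_at_3 = tokens_generated(3, 8)
pages_at_3 = estimated_pages(tokens_at_3)

# The hypothetical 1 t/s case over the same 8 hours.
tokens_at_1 = tokens_generated(1, 8)
pages_at_1 = estimated_pages(tokens_at_1)

print(f"3 tk/s for 8 h: {tokens_at_3:.0f} tokens, ~{pages_at_3:.0f} pages")
print(f"1 tk/s for 8 h: {tokens_at_1:.0f} tokens, ~{pages_at_1:.0f} pages")
```

Under those assumptions, 8 hours at 3 tk/s lands inside the 100-150 page range, and 1 t/s gives about a third of that. Change the words-per-page guess and the page count moves accordingly.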