
Comment by whizzter

3 months ago

Mainly it points to a non-scientific "bigger is better" mentality; the researchers probably didn't mind playing around with all that compute because "scale" is "cool".

Remember that the Lisp AI-lab people were working on unsolved problems on absolute potatoes of computers back in the day. We have a semblance of progress now, but so much of it has been brute force (even if there have been real improvements in the field).

The big question is whether this insane spending will have pulled the rug out from under real progress if we head into another AI winter of disillusionment, or whether there is enough real progress just around the corner to give investors hope in a post-DeepSeek valuation hangover.

We are in a phase where costs are really coming down. From GPT-2 to about GPT-4, the key to building better models was just building bigger models and training them for longer. But since then a lot of work has gone into distillation and other techniques to make smaller models more capable.
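To make "distillation" concrete, here is a minimal sketch of the standard knowledge-distillation loss, assuming PyTorch: a small student model is trained to match the temperature-softened output distribution of a larger, frozen teacher. The function name and hyperparameters are illustrative, not anything from the thread.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft targets: KL divergence between temperature-softened
        # student and teacher distributions, scaled by T^2 as is conventional.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Hard targets: ordinary cross-entropy against the true labels.
        hard = F.cross_entropy(student_logits, labels)
        # Blend the two; alpha controls how much the student imitates the teacher.
        return alpha * soft + (1 - alpha) * hard

The point is that most of the "knowledge" in the big model's output distribution can be transferred into a much smaller network, which is one reason inference costs have fallen so quickly.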

If there is another AI winter, it will be more like the dotcom bubble: a lot of important work got done during that bubble, and many of today's big tech companies were built from the fruits of that labor in the decade after it burst.