Comment by rbanffy

1 day ago

This is a very important point - the market for training chips might be a bubble, but the market for inference is much, much larger. At some point we might have good enough models and the need for new frontier models will cool down. The big power-hungry datacenters we are seeing are mostly geared towards training, while inference-only systems are much simpler and more power-efficient.

A real shame, BTW, that all that silicon doesn't do FP32 (very well). Once training is no longer in such demand, we could otherwise use all that number crunching for climate models and weather prediction.

It's already the case that people are eking out most further gains by layering "reasoning" on top of what existing models can do - in other words, using massive amounts of inference to substitute for increases in model performance. Wherever things plateau, I expect this will still be the case - so inference will ultimately always be the endgame market.

Some more traditional number crunching has long looked at lower- and mixed-precision hardware.