Comment by zozbot234

7 hours ago

Why would power spikes from training runs imply training >> inference? The cost of a training run scales with energy, whereas power is energy per unit time. All a spike tells you is that they're speeding up the training run so that it takes less time overall (probably chasing some first-mover advantage, where they're out with a given model before their closest competitors). They obviously can't do that for inference, which is a steady flow of requests over time.
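
To make the power vs. energy point concrete, here is a rough sketch with made-up numbers (the 50 MW / 100 MW figures and the run lengths are purely illustrative assumptions, not real data about any lab). The helper `energy_mwh` is a hypothetical name; it just multiplies average power by duration:

```python
def energy_mwh(power_mw: float, hours: float) -> float:
    """Energy in MWh for a run drawing a constant average power."""
    return power_mw * hours

# Hypothetical numbers: the same training job run two ways.
slow = energy_mwh(power_mw=50.0, hours=2000.0)   # lower draw, longer run
fast = energy_mwh(power_mw=100.0, hours=1000.0)  # double the draw, half the time

# Peak power doubles, but total energy (what the cost scales with) is unchanged.
print(slow, fast)
```

Under these assumptions both runs consume the same energy, so the height of the spike alone says nothing about how training energy compares to inference energy.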