Comment by mig1

2 months ago

The argument that the data centers and all the GPUs will still be useful even in the context of DeepSeek doesn't add up... they basically showed that there are diminishing returns past a certain amount of compute. And so far it hasn't made OpenAI or Anthropic go faster, has it?

What is the source for the diminishing returns? I would like to read about it, as I have only seen papers indicating that the scaling laws still apply.