Comment by vessenes

2 months ago

They don’t compare in the paper. I will say I experimented extensively with GPT-3 era LLMs on improving output by trying to guide early diffusion models with critical responses. It was a) not successful, and b) pretty clear to me that GPT-3 didn’t “get” what it was supposed to be doing, or didn’t have enough context to keep all this in mind, or couldn’t process it properly, or some such thing.

This paper has ablations, although I didn’t read that section, so you could see where they say the effectiveness comes from. I bet, though, that it’s emergent from a bunch of different places.

FWIW, I don’t think LLMs will solve all our problems, so I too am skeptical of that claim. I’m not skeptical of the slightly weaker “larger models have emergent capabilities and we are probably not done finding them as we scale up”.


tomrod 2 months ago

> FWIW, I don’t think LLMs will solve all our problems, so I too am skeptical of that claim. I’m not skeptical of the slightly weaker “larger models have emergent capabilities and we are probably not done finding them as we scale up”.

100% agree. I'd classify the time now as identifying the limits of what they can functionally do though, and it's a lot!