Comment by viraptor
6 months ago
That looks like a classic Actor/Critic setup, yet it's not mentioned even once in the paper. Am I missing some large difference here?
In actor/critic, the actor and critic are normally learned, i.e., their weights are adjusted during the process. The paper is correct that their method is zero-shot, but it doesn't mention that the method is essentially equivalent to running a few rounds of training and then discarding the training update.
Anyone who works with deep architectures and momentum-based optimizers knows that the first few updates alone provide large improvements in loss. In this paper the breakthrough is that computing these first few updates at test time enables one to describe the algorithm as "without training" and therefore attract hype.
> discarding the training update
But they aren't updating the model weights. They're iteratively updating the prompt. It's automating the process that humans use with generative models.
Agreed that it's conceptually equivalent though.
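For anyone who wants the distinction spelled out, here is a minimal sketch of that kind of loop. It assumes a toy setup: `generator`, `scorer`, and `refine` are hypothetical stand-ins I made up, not the paper's models or API. The point is only that both functions stay frozen and the prompt is the only thing that changes between iterations.

```python
import random


def generator(prompt: str, rng: random.Random) -> str:
    """Hypothetical frozen generator: proposes a variant of the prompt."""
    extras = ["detailed", "sharp", "blurry", "photo"]
    return prompt + " " + rng.choice(extras)


def scorer(candidate: str) -> float:
    """Hypothetical frozen scorer: counts distinct target words present."""
    target = {"detailed", "sharp", "photo"}
    return float(len(target & set(candidate.split())))


def refine(prompt: str, steps: int = 20, seed: int = 0):
    """Iteratively update the prompt; no weights exist, so none are updated."""
    rng = random.Random(seed)
    best, best_score = prompt, scorer(prompt)
    history = [best_score]
    for _ in range(steps):
        candidate = generator(best, rng)
        score = scorer(candidate)
        # Keep the candidate only if the scorer prefers it.
        if score > best_score:
            best, best_score = candidate, score
        history.append(best_score)
    return best, best_score, history


if __name__ == "__main__":
    best, best_score, history = refine("a cat")
    print(best, best_score)
```

In an actor/critic setup the equivalent loop would also apply gradient updates to the actor and critic; here the feedback from the scorer only ever flows into the prompt string.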
Yes, apparently they've developed new names: Generator and Scorer. This feels a bit like "Tai's Model" https://news.ycombinator.com/item?id=17863514
Haha, "Tai's Model" is absolutely hilarious; that gave me a good chuckle. I checked and it is currently cited 568 times.