Comment by viraptor
2 days ago
That looks like a classic Actor/Critic setup, yet it's not mentioned even once in the paper. Am I missing some large difference here?
In actor/critic, the actor and critic are normally learned, i.e., their weights are adjusted during the process. The paper is right that the method is zero-shot, but it doesn't mention that it's essentially equivalent to doing a few rounds of training but then discarding the training update.
Anyone who works with deep architectures and momentum-based optimizers knows that the first few updates alone produce large improvements in loss. In this paper the "breakthrough" is that computing those first few updates at test time lets the algorithm be described as "without training" and thereby attract hype.
> discarding the training update
But they aren't updating the model weights. They're iteratively updating the prompt, automating the trial-and-error loop humans already use with generative models.
Agreed that it's conceptually equivalent though.
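To make the comparison concrete, here's a minimal sketch of the loop being described: a "Generator" proposes candidate prompts, a "Scorer" ranks them, and only the prompt is updated, never any weights. The `refine_prompt`, `toy_generate`, and `toy_score` names are my own stand-ins, not anything from the paper:

```python
def refine_prompt(seed_prompt, generate, score, rounds=3, width=4):
    """Greedy prompt search: keep the best-scoring candidate each round.
    No model parameters are touched; only the prompt string changes."""
    best_prompt, best_score = seed_prompt, score(seed_prompt)
    for _ in range(rounds):
        for candidate in generate(best_prompt, n=width):
            s = score(candidate)
            if s > best_score:
                best_prompt, best_score = candidate, s
    return best_prompt, best_score

# Toy stand-ins so the sketch runs: "generation" appends a variant tag,
# "scoring" just rewards longer prompts. A real setup would call an LLM
# for both roles.
def toy_generate(prompt, n):
    return [f"{prompt} v{i}" for i in range(n)]

def toy_score(prompt):
    return len(prompt)

prompt, s = refine_prompt("describe the image", toy_generate, toy_score)
```

Squint and it's policy improvement with a frozen critic, which is why it reads as actor/critic with the gradient step swapped for search over prompts.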
Yes, apparently they've developed new names: Generator and Scorer. This feels a bit like "Tai's Model" https://news.ycombinator.com/item?id=17863514
Haha, "Tai's Model" is absolutely hilarious, that gave me a good chuckle. I checked and it's currently cited 568 times.