Comment by ethbr1

4 months ago

Isn't that just RL with extra power-intensive steps? (An entire model chugging away in the goal function)

ethbr1

That's correct, but if successful you'd essentially have updated the LLM's knowledge and capabilities "on the fly".

ethbr1 4 months ago

Maybe we could run that kind of load off-peak, when power is cheaper. Call it dreaming. ;)