Comment by ethbr1
4 months ago
Isn't that just RL with extra power-intensive steps? (An entire model chugging away in the goal function)
4 months ago
That's correct, but if it worked, you'd essentially have updated the LLM's knowledge and capabilities "on the fly".
Maybe we could run that kind of load off-peak, when power is cheaper. Call it dreaming. ;)