Comment by ethbr1
18 days ago
Isn't that just RL with extra power-intensive steps? (An entire model chugging away in the goal function)
That's correct, but if successful you'd essentially have updated the LLM's knowledge and capabilities "on the fly".
Maybe we could run that kind of load off-peak, when power is cheaper. Call it dreaming. ;)