Comment by sandGorgon
5 days ago
This is actually a very valid technique. We do the same (as an RL environments provider).
Except we bundle it with a custom browser renderer that generates rewards based on DOM diffs rather than screenshots.
The browser renderer is open source: https://github.com/wootzapp/wootz-browser
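
For anyone wondering what a DOM-diff reward could look like, here is a minimal sketch using only the Python standard library. It is not taken from wootz-browser; the function names (`dom_signature`, `dom_diff`, `dom_diff_reward`) and the scoring formula are assumptions made up for illustration. The idea is to compare the nodes the agent actually added or removed against the nodes a goal page would have added or removed.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _NodeCollector(HTMLParser):
    """Collects a multiset of node signatures: (path, attributes or text)."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.nodes = Counter()

    def handle_starttag(self, tag, attrs):
        path = "/".join(self.stack + [tag])
        self.nodes[(path, tuple(sorted(attrs, key=lambda kv: kv[0])))] += 1
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or tag not in self.stack:
            return
        # Pop up to and including the matching open tag.
        while self.stack.pop() != tag:
            pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes[("/".join(self.stack + ["#text"]), text)] += 1


def dom_signature(html: str) -> Counter:
    collector = _NodeCollector()
    collector.feed(html)
    collector.close()
    return collector.nodes


def dom_diff(before: str, after: str):
    """Return (added, removed) node multisets between two HTML snapshots."""
    old, new = dom_signature(before), dom_signature(after)
    return new - old, old - new


def dom_diff_reward(before: str, after: str, goal: str) -> float:
    """Hypothetical scoring: matched changes / (expected + unexpected changes)."""
    exp_added, exp_removed = dom_diff(before, goal)
    got_added, got_removed = dom_diff(before, after)
    expected = sum(exp_added.values()) + sum(exp_removed.values())
    if expected == 0:
        return 1.0 if not got_added and not got_removed else 0.0
    hit = (sum((exp_added & got_added).values())
           + sum((exp_removed & got_removed).values()))
    extra = (sum((got_added - exp_added).values())
             + sum((got_removed - exp_removed).values()))
    return hit / (expected + extra)


before = '<form><input name="q" value=""><button>Go</button></form>'
goal = '<form><input name="q" value="cats"><button>Go</button></form>'
print(dom_diff_reward(before, goal, goal))
```

Unlike a screenshot comparison, a score like this does not change with fonts, antialiasing, or scroll position, but it only sees what ends up in the serialized DOM.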