Comment by kenforthewin
10 days ago
Nice work. I think games are a great way to benchmark AI, especially games that involve long-term strategy. I recently built an agent harness for NetHack (https://glyphbox.app/). Like you, I suspect that there's a lot you can do at the harness / tool level to improve performance with existing models.