Comment by HanClinto
6 days ago
Another thing that I've thought about doing is to use some sort of computer vision to watch streamers of online games, and use speech-to-text (STT) to capture not just play datasets, but also datasets of their narrated reasoning about why they play what they play.
Would be a lot of work to go through and use computer vision and some measure of reasoning to create these datasets, but some players do an excellent job of narrating the reasoning behind their plays (thinking of players like Cheon or LSV), so it would be fascinating.
Caleb Gannon [0] is one such streamer who does a good job of narrating his plays, and he's also a computer scientist who is very interested in machine-learning projects (he's done several of his own). If you contacted him, I could definitely see him being willing to consent to his videos being used as a fine-tuning dataset for such purposes.
I would be willing to help with creating this dataset if you helped me understand what you would like to see in the final output format.
Down the road I can definitely imagine being interested in that (basically split out the "web-based replay viewer" part from the "LLM harness that I want to debug with a replay viewer" part, and then ingest non-LLM games into the viewer), but for now they're super entangled and I'm not prioritizing separating them cleanly. I'll definitely keep this offer in mind for the future, thanks!