Comment by vunderba
7 hours ago
Interesting but frustratingly vague on details. How exactly are the models playing? Is it using some kind of PGN equivalent in Tetris that represents a on-going game, passing an ASCII representation, encoding as a JSON structure, or just directly sending screenshots of the game to the various LLMs?
It has to be turn-based. Even with Flash's speed, the inference latency would kill you in a real-time loop. They're likely pausing the game state after every tick to wait for the API response before resuming.
answered this in a comment above! It's not turn or visual layout based since LLMs are not trained that way. The representation is a JSON structure, but LLMs plug in algorithms and keeps optimizing it as the game state evolves
Thanks for the clarification! Kind of reminds me of the Brian Moore's AI clocks which uses several LLMs to continuously generate HTML/CSS to create an analog clock for comparisons.
https://clocks.brianmoore.com
Wow this is incredible!!
I suppose you could argue about whether it's an LLM at that point but vision is a huge part of frontier models now, no?