Comment by vunderba

14 days ago

Interesting, but frustratingly vague on details. How exactly are the models playing? Is it using some kind of PGN equivalent for Tetris that represents an ongoing game, passing an ASCII representation, encoding the state as a JSON structure, or just sending screenshots of the game directly to the various LLMs?

6 comments

storystarling 14 days ago

It has to be turn-based. Even with Flash's speed, the inference latency would kill you in a real-time loop. They're likely pausing the game state after every tick to wait for the API response before resuming.

ykhli 14 days ago

Answered this in a comment above! It's not turn-based or visual-layout-based, since LLMs are not trained that way. The representation is a JSON structure, but the LLMs plug in algorithms and keep optimizing them as the game state evolves.
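
For readers wondering what "a JSON structure plus a plugged-in algorithm" could look like, here is a minimal sketch. The thread does not give the real schema or heuristics, so the field names (`board`, `piece`), the weights, and the helper functions below are all hypothetical. The idea is only that the model receives the state as JSON and supplies (and later revises) a scoring function, rather than choosing each move itself.

```python
import json

# Hypothetical game state; the real schema is not described in the thread.
state_json = json.dumps({
    "board": [
        "....",
        "....",
        "#..#",
        "#.#.",
    ],
    "piece": "I",
})


def column_heights(board):
    """Height of each column, measured from the floor to its highest filled cell."""
    rows = len(board)
    heights = []
    for col in range(len(board[0])):
        height = 0
        for row in range(rows):
            if board[row][col] == "#":
                height = rows - row
                break
        heights.append(height)
    return heights


def count_holes(board):
    """Count empty cells that have a filled cell above them in the same column."""
    holes = 0
    for col in range(len(board[0])):
        seen_block = False
        for row in range(len(board)):
            if board[row][col] == "#":
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def score(board, weights):
    """Assumed heuristic: higher is better, so stack height and holes are penalized.

    This is the kind of function a model could rewrite between calls.
    """
    return -(weights["height"] * sum(column_heights(board))
             + weights["holes"] * count_holes(board))


state = json.loads(state_json)
weights = {"height": 1.0, "holes": 4.0}  # illustrative values only
print(score(state["board"], weights))
```

In a setup like this the game engine would call `score` on every candidate placement locally, and the model would only be consulted when the weights or the function itself need revising.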

vunderba 14 days ago
Thanks for the clarification! Kind of reminds me of Brian Moore's AI clocks, which use several LLMs to continuously generate HTML/CSS for an analog clock so the results can be compared.
https://clocks.brianmoore.com
ykhli 14 days ago

Wow this is incredible!!
storystarling 13 days ago

Curious how the token economics compare here to a standard agent loop. It seems like if you're using the LLM as a JIT to optimize the algorithm as the game evolves, the context accumulation would get expensive fast even with Flash pricing.
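
A rough back-of-the-envelope sketch of the concern above, under made-up numbers: if every call resends the whole history, total input tokens grow roughly quadratically with the number of calls, whereas a fixed-size context grows linearly. The call count and token sizes here are assumptions for illustration, not measurements from the project, and no real pricing is implied.

```python
def accumulated_input_tokens(calls, base_tokens, tokens_added_per_call):
    """Total input tokens when each call resends the full, growing history."""
    return sum(base_tokens + i * tokens_added_per_call for i in range(calls))


def fixed_input_tokens(calls, base_tokens):
    """Total input tokens when each call sends only a fixed-size context."""
    return calls * base_tokens


# Illustrative values only: 100 calls, 2,000-token base prompt,
# 500 tokens of new history appended after each call.
calls, base, added = 100, 2_000, 500

growing = accumulated_input_tokens(calls, base, added)
flat = fixed_input_tokens(calls, base)

print(f"accumulating context: {growing:,} input tokens")
print(f"fixed context:        {flat:,} input tokens")
print(f"ratio:                {growing / flat:.2f}x")
```

Whether this matters in practice depends on how often the model is actually called and whether the history is trimmed or summarized between calls, which the thread does not say.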
mhh__ 14 days ago

I suppose you could argue about whether it's an LLM at that point but vision is a huge part of frontier models now, no?