Comment by vessenes

6 days ago

Okay, hold my beer. Both Claude and Codex are running now.

EDIT: both agents took about 20 minutes. I used that exact prompt in a clean directory for each, and then said "deploy to netlify" - so a total of two prompts.

Codex: https://astounding-bavarois-27b5a2.netlify.app

Claude: http://strong-hotteok-91dfb0.netlify.app

Netlify is having trouble claiming the Claude project, so if you need a password it's "My-Drop-Site"

FYI, during the fun test loop Claude rated its own game 7.7/10 for fun, and Codex rated its own 98/100. As you'll see if you poke at them, Claude needs a physics bug fix round. But I think these both did about what I would have expected.

Nice, very retro (looking at the codex one)!

The Claude one doesn't really work (collision detection was the problem I had before too), but it's fairly close.

Yes, when I tried previously I had a few gameplay issues in Frogger, and I couldn't manage to one-shot this sort of thing at the time (a year ago), so the last year definitely saw some good progress. I was very happy with the Asteroids game, though: it had a very cool retro feel and was wireframe only. I wasn't so keen on the code produced, as it had a patchwork feel to it.

  • To your point, I didn't even look at the code... :) Okay, I looked at the Codex code. It's super reasonable: separation of concerns, operating on a state model, and not over-designed. I did not hate it. I also noted that Codex put in a CRT simulator loop, which is a nice touch.

    I think a year ago this would have taken a lot of back and forth and arguing; to me that's kind of the point of Simon's article -- a lot more just 'works' now.

    • Sorry, I meant the code from a year ago. It took a bit more hand-holding at that point and it was a mishmash of different things, but I feel it’s just slightly easier now, and still similar. I haven’t looked into this one, just had a quick play. Thanks for trying it out!

      I think his article covers the last 6 months. My feeling is that progress with LLMs has stalled recently, and generated code still has problems with accuracy, coherence, and subtle bugs, but everyone has a different experience.

Frogger is kind of too well known, so there is ample training data for building that specific game.

The game I was thinking of is relatively obscure: Panel de Pon

  • Yes, I was surprised at the time that it failed so badly at Frogger. From memory, it was collision detection that it just couldn't get right, plus the positioning of various game elements, as the game has quite a lot going on (the examples above still have some problems with these things). I thought there would be open source examples out there in JS/HTML, but perhaps not so many for Frogger.