Comment by vessenes

6 days ago

Out of curiosity - what harness did you use, and what model? And how are you prompting? In my mind prompting like:

“You’re going to make frogger in javascript. I want a complete clone of functionality for level 1, with amazing 80s era pixel art sprites. I’m super lazy, so you’re going to have to test everything, right from the start. Pick a test harness, write the tests, including tests for having amazing graphics, gameplay, input, UI, sounds, etc, and write a full workplan, then work through that workplan, in parallel where you can. The workplan should emphasize getting a stripped down version up immediately and have workstreams for all the major requirements after that. Add a final test that assesses how fun the game is by reviewing a real video of a test run. Loop on that final test until you can’t improve things any more.”

Should produce something playable with no further input. As you say, I’m not sure it would produce a codebase we’d want to look at or work on. But, I’d be surprised if this weren’t successful.

Sure, give it a go. Perhaps it will work better now with frontier models; I haven't tried it in a while (this was a year ago, and things have improved since then). I'm not sure what tests for amazing graphics, gameplay, input, UI, sounds, etc. would look like, but it would be interesting to see the results!
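
For the gameplay part at least, a test is easy enough to picture. Here is a minimal sketch, assuming a hypothetical axis-aligned bounding-box check (`overlaps`) and a made-up `stepCar` update function; none of these names or numbers come from either agent's output, and "amazing graphics" or "fun" would need something quite different:

```javascript
// Hypothetical helpers for illustration, not taken from either generated game.

// Axis-aligned bounding-box overlap: true when rectangles a and b intersect.
function overlaps(a, b) {
  return a.x < b.x + b.w && a.x + a.w > b.x &&
         a.y < b.y + b.h && a.y + a.h > b.y;
}

// Advance a car by dt seconds at its horizontal speed (pixels per second).
function stepCar(car, dt) {
  return { ...car, x: car.x + car.speed * dt };
}

// Gameplay check: a car driving through the frog's lane must hit it,
// while a car sitting in the adjacent lane must not.
const frog = { x: 100, y: 200, w: 16, h: 16 };
let car = { x: 0, y: 200, w: 32, h: 16, speed: 60 };
let hit = false;
for (let i = 0; i < 120; i++) {          // simulate 2 seconds at 60 fps
  car = stepCar(car, 1 / 60);
  if (overlaps(frog, car)) hit = true;
}
const otherLane = { x: 100, y: 216, w: 32, h: 16, speed: 0 };
console.log(hit, overlaps(frog, otherLane)); // true false
```

Keeping the update step a pure function of state and `dt` is what makes this testable without a browser: the simulation can be run frame by frame and asserted on directly.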

  • Okay, hold my beer. Both Claude and Codex are running now.

    EDIT: Both agents took about 20 minutes. I used that exact prompt in a clean directory for each, and then said "deploy to netlify", so two prompts in total.

    Codex: https://astounding-bavarois-27b5a2.netlify.app

    Claude: http://strong-hotteok-91dfb0.netlify.app

    Netlify is having trouble claiming the Claude project, so if you need a password it's "My-Drop-Site"

    FYI, Claude rated itself 7.7/10 for fun, and Codex 98/100 during the fun test loop. As you'll see if you poke at them, Claude needs a physics bug fix round. But I think these both did about what I would have expected.

    • Nice, very retro (looking at the codex one)!

      The Claude one doesn't really work (collision detection was the problem I had before too), but it's fairly close.

      Yes, when I tried previously (a year ago) I had a few gameplay issues in Frogger and couldn't manage to one-shot this sort of thing, so the last year has definitely seen good progress. I was very happy with the Asteroids game, though: it had a very cool retro feel and was wireframe only. I wasn't so keen on the code it produced, as it had a patchwork feel to it.

    • Frogger is so well known that there is ample training data for building that specific game.

      The game I was thinking of is relatively obscure: Panel de Pon.