Claude wrote a functional NES emulator using my engine's API

1 month ago (carimbo.games)

I'd be curious how well it does on 100th Coin's NES accuracy tests: https://github.com/100thCoin/AccuracyCoin

  • Indeed, that's what I hinted at in https://news.ycombinator.com/item?id=46437688 shortly after: OK, one can "generate" a "solution", and that's much easier than before. But until we can somehow verify that it actually does what it says it does (we know about hallucinations and have no reason to believe that has changed), testing itself, especially against well-known "problems", becomes more and more important.

    That being said, it doesn't answer the "why" in the first place, which is an even more important question. At least it does help somewhat to compare with existing alternatives.

    • Isn’t this how all software development works? Folks commit code, it’s tested, and reviewed, and then deployed.

      Why would this be any different?

Git wrote a functional NES emulator for me by simply cloning one of the many publicly available ones!

  • This is the comment.

    Give it copy-paste / translate tasks and it's a no-brainer (quite literally).

    But same can be said of humans.

    The question here is: did it implement it because it read the available online documentation about the NES architecture, or did it just see one too many such implementations?

    • > But same can be said of humans.

      Indeed, the 'clean room' standard has always been that one team does the reverse engineering and writes a spec, and then another team that has never seen the original (and has written statements with penalty clauses to prove it) does the re-implementation. If you were to read the implementation, write the spec, and then write the re-implementation yourself, you would definitely be violating the standard for claiming an original work.

It’s a shame that the source code isn’t commented and documented more. At the very least, I think it would be helpful to add some documentation for every CPU opcode being emulated.
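
To make the idea concrete, here is a minimal sketch of what per-opcode documentation could look like if it lived next to the dispatch table. This is not taken from the project's source: `OpcodeDoc`, `OPCODE_DOCS`, and `describe` are hypothetical names, and only four well-known 6502 opcodes are filled in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpcodeDoc:
    """Hypothetical documentation record for one 6502 opcode."""
    mnemonic: str
    mode: str      # addressing mode
    size: int      # instruction length in bytes
    cycles: int    # base cycle count
    flags: str     # status flags the instruction may change
    summary: str


# A few standard 6502 opcodes; a real table would cover all of them.
OPCODE_DOCS = {
    0xA9: OpcodeDoc("LDA", "immediate", 2, 2, "NZ", "Load the operand byte into A."),
    0x85: OpcodeDoc("STA", "zero page", 2, 3, "", "Store A at a zero-page address."),
    0xAA: OpcodeDoc("TAX", "implied", 1, 2, "NZ", "Copy A into X."),
    0xEA: OpcodeDoc("NOP", "implied", 1, 2, "", "Do nothing."),
}


def describe(opcode: int) -> str:
    """Return a one-line human-readable description of an opcode."""
    doc = OPCODE_DOCS.get(opcode)
    if doc is None:
        return f"${opcode:02X}: undocumented in this sketch"
    flags = doc.flags or "none"
    return (
        f"${opcode:02X} {doc.mnemonic} ({doc.mode}): {doc.summary} "
        f"{doc.size} byte(s), {doc.cycles} cycles, flags: {flags}"
    )


if __name__ == "__main__":
    for op in (0xA9, 0x85, 0x02):
        print(describe(op))
```

Keeping the text in data rather than in free-form comments would also let a debugger or trace log print it, though whether that fits this particular codebase is a guess.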

  • Forbidding the LLM from writing comments and docstrings (preferably enforced by build and commit hooks) is one of the best "hacks" for using that thing. An LLM cannot help but emit poisonous comments.

  • Probably better to look at a human-authored emulator if you want comments containing accurate information anyway.

  • If you let it, Claude Code will write a comment for almost every single line of code.

    • Even if you try to get them to not, they will still overcomment the code. Or at least overcomment it from the perspective of a human. From the perspective of the LLM, I suspect the comments are necessary for it to be able to get the code output correct.

Oh neat, I've been working with Claude on an NES emulator in Racket using an SDL3 wrapper also written mostly by Claude.

I tried this a while back using Gemini 2.5 Pro, around the time Gemini CLI was released. I never got the emulator to work in the end, so I dropped the idea.

So this is impressive to me in terms of how fast things have progressed.

Nice, but an NES emulator is one of the most frequently written pet projects anywhere, which makes it considerably less impressive.

  • This is a good point. I wonder how much NES emulator code is in Claude's training set? Not to knock what the author has done here, but I wonder if this is more of a softball challenge than it looks.

  • Somewhere along the line the AI bros stopped separating training and testing sets. It's great for impressing the villagers.

It's WASM and the performance seems catastrophically bad (45 ms to render a frame on an M4 laptop)? It would be much more impressive if Claude could optimize it into something that someone would actually want to play. Compare this to a random hit from Google, https://jsnes.org/, which has sound, a much smaller payload, and runs really fast (<1 ms to render a frame).

So the cost of slop is a >40x drop in performance? Pick any metric that you care about in your domain: perhaps that's what you're going to lose. And is the effort to recover it practical with current vibe-coding strategies?
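
For a rough sense of scale, here is a back-of-the-envelope sketch of those numbers. The 45 ms and 1 ms frame times are the figures quoted above (1 ms being an upper bound for jsnes); the NTSC refresh rate of roughly 60.1 Hz is the only outside assumption, and the function names are made up for illustration. It also assumes render time is the whole per-frame cost, which may not be true.

```python
NTSC_FPS = 60.0988  # approximate NTSC NES refresh rate (assumption)


def frame_budget_ms(fps: float = NTSC_FPS) -> float:
    """Time available per frame, in milliseconds, to run at full speed."""
    return 1000.0 / fps


def achievable_fps(frame_ms: float) -> float:
    """Frame rate reachable if every frame takes frame_ms to render."""
    return 1000.0 / frame_ms


def slowdown(slow_ms: float, fast_ms: float) -> float:
    """How many times slower one renderer is than another."""
    return slow_ms / fast_ms


if __name__ == "__main__":
    print(f"frame budget:     {frame_budget_ms():.1f} ms")
    print(f"fps at 45 ms:     {achievable_fps(45.0):.1f}")
    print(f"slowdown vs 1 ms: {slowdown(45.0, 1.0):.0f}x (a lower bound)")
```

In other words, 45 ms per frame is well over the per-frame budget for full speed, so the game could only run at a fraction of its intended rate.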

  • For me on Firefox/macOS it's terribly slow, fails to initialise/resume sound, and has no keyboard input.

I will be impressed when a new game console comes to market and it can write the first emulator for it.

Who cares what it did? What did you learn? To live is to learn.