Comment by nemothekid
12 days ago
I think I should write more about this, but I have been feeling very similar. I've recently been exploring using claude code/codex as the "default", so I decided to implement a side project.
My gripe with AI tools in the past was that the kind of work I do is large and complex, and with previous models it just wasn't efficient to either provide enough context or deal with context rot when working on a large application - especially when that application doesn't have a million examples online.
I've been trying to implement a multiplayer game with server-authoritative networking in Rust with Bevy. I specifically chose Bevy because its latest version came out after Claude's cutoff, it had a number of breaking changes, and there aren't a lot of deep examples online.
Overall it's going well, but one downside is that I don't really understand the code "in my bones". If you told me tomorrow that I had to optimize latency, or that there was a 1-in-100 edge case, not only would I not know where to look, I don't think I could tell you how the game engine works.
In the past, I could never have gotten this far without really understanding my tools. Today, I have a semi-functional game and, truth be told, I don't even know what an ECS is or what advantages it provides. I really consider this a huge problem: if I had to maintain this in production and there was a SEV0 bug, am I confident enough that I could fix it? Or am I confident the model could figure it out? Or is the model good enough that it could scan the entire code base and intuit a solution? One of these three questions has to be answered, or else brain atrophy is a real risk.
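For anyone equally fuzzy on it: my surface-level understanding is that in an ECS, components are plain data attached to entities, and systems are functions that query for entities carrying particular components. A minimal sketch of what that looks like in Bevy (assuming a recent version; the exact API shifts between releases, and this isn't my actual game code):

    use bevy::prelude::*;

    // Components are plain data attached to entities.
    #[derive(Component)]
    struct Position(Vec2);

    #[derive(Component)]
    struct Velocity(Vec2);

    // A system is just a function; the Query declares which entities
    // it cares about (here: anything with both components).
    fn apply_velocity(mut query: Query<(&mut Position, &Velocity)>) {
        for (mut pos, vel) in &mut query {
            pos.0 += vel.0;
        }
    }

    fn main() {
        App::new()
            .add_plugins(DefaultPlugins)
            .add_systems(Update, apply_velocity)
            .run();
    }

The advantage, as I understand it, is that logic lives in lots of small systems that only touch the data they declare, rather than in god objects - which is exactly the kind of thing I couldn't have told you before poking at this.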
I'm worried about that too. If the error is reproducible, the model can eventually figure it out from experience. But a ghost bug that I can't find a pattern for? The model ends up in a "you're absolutely right" loop as it incorrectly guesses different solutions.
Are ghost bugs even real?
My first job had the devs working front-line support, years ago. Due to that, I learnt an important lesson in bug fixing.
Always be able to re-create the bug first.
There is no such thing as ghost bugs; you just need to ask the reporter the right questions.
Unless your code is multi-threaded, to which I say, good luck!
They're real at scale. Plenty of bugs don't surface until you're running under heavy load on distributed infrastructure. Often the culprit is low in the stack. Asking the reporter the right questions may not help in this case: you have full traces, but you can't reproduce the problem in a test environment.
When the cause is difficult to source or fix, it's sometimes easier to address the effect by coding around the problem, which is why mature code tends to have some unintuitive warts to handle edge cases.
> Unless your code is multi-threaded, to which I say, good luck!
What isn't multi-threaded these days? Kinda hard to serve HTTP without concurrency, and practically every new business needs to be on the web (or to serve multiple mobile clients; same deal).
All you need is a database and a web form submission, and now you have a full distributed system on your hands.
Historically I would have agreed with you. But since the rise of LLM-assisted coding, I've encountered an increasing number of things I'd call clear "ghost bugs" in single threaded code. I found a fun one today where invoking a process four times with a very specific access pattern would cause a key result of the second invocation to be overwritten. (It is not a coincidence, I don't think, that these are exactly the kind of bugs a genAI-as-a-service provider might never notice in production.)
> I've been trying to implement a multiplayer game with server authoritative networking in Rust with Bevy. I specifically chose Bevy as the latest version was after Claude's cut off, it had a number of breaking changes, and there aren't a lot of deep examples online.
I am interested in doing something similar (Bevy, not multiplayer).
I had the thought that you ought to be able to provide a cargo doc or rust-analyzer equivalent over MCP? This... must exist?
I'm also curious how you test if the game is, um... fun? Maybe it doesn't apply so much for a multiplayer game, I'm thinking of stuff like the enemy patterns and timings in a soulslike, Zelda, etc.
I did use ChatGPT to get some rendering code for a retro RCT/SimCity-style terrain mesh in Bevy, and it basically worked, though several times I had to tell it "yeah uh nothing shows up", at which point it said "of course! the problem is..." and then I learned about mesh winding, fine, okay... I felt like I was in over my head and decided to go with a 2D game instead, so I didn't pursue that further.
>I had the thought that you ought to be able to provide a cargo doc or rust-analyzer equivalent over MCP? This... must exist?
I've found that there are two issues that arise that I'm not sure how to solve. You can give it the docs and point it at them, and it can generally figure out the syntax, but the next issue I see is that, without examples, it kind of just brute-forces problems like a 14 year old.
For example, the input system originally just let you move left and right, and it popped that into an observer function. As I added more and more controls, it got littered with more and more code, until it was a ~600-line function responsible for a large chunk of the game logic.
While trying to parse it I then had it refactor the code - but I don't know if the current code is idiomatic. What would be the cargo doc or rust-analyzer equivalent for good architecture?
I'm running into this same problem when trying to use claude code for internal projects. Some parts of the codebase have really intuitive internal frameworks, and claude code can rip through them and produce great idiomatic code. Others are bogged down by years of tech debt and performance hacks, and claude code can't be trusted with anything other than multi-paragraph prompts.
>I'm also curious how you test if the game is, um... fun?
Luckily for me, this is a learning exercise, so I'm not optimizing for fun. I guess you could ask claude code to inject more fun.
> What would be the cargo doc or rust-analyzer equivalent for good architecture?
Well, this is where you still need to know your tools. You should understand what ECS is and why it is used in games, so that you can push the LLM to use it in the right places. You should understand idiomatic patterns in the languages the LLM is using. Understand YAGNI, SOLID, DDD, etc etc.
That's where the LLMs fall down, so that's where you come in. Writing the individual lines of code, after being told what architecture to use and what is idiomatic, is where the LLM shines.
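To make that concrete with the input example above: rather than one observer accreting all of the game logic, you'd push it toward small single-purpose systems that each handle one concern and only touch their own components. A rough, hypothetical sketch (assuming Bevy 0.13 or later, where the input resource is ButtonInput; the names are made up, not from the parent's code):

    use bevy::prelude::*;

    #[derive(Component)]
    struct Player;

    #[derive(Component)]
    struct Velocity(Vec2);

    // One system per concern: this one only maps keys to movement.
    fn read_movement_input(
        keys: Res<ButtonInput<KeyCode>>,
        mut players: Query<&mut Velocity, With<Player>>,
    ) {
        let mut dir = Vec2::ZERO;
        if keys.pressed(KeyCode::KeyA) { dir.x -= 1.0; }
        if keys.pressed(KeyCode::KeyD) { dir.x += 1.0; }
        for mut vel in &mut players {
            vel.0 = dir;
        }
    }

    // Jumping, shooting, etc. each get their own small system,
    // registered side by side instead of growing one mega-function.
    fn read_jump_input() {}

    fn main() {
        App::new()
            .add_plugins(DefaultPlugins)
            .add_systems(Update, (read_movement_input, read_jump_input))
            .run();
    }

Knowing that this is the shape you want is the part the LLM won't reliably supply on its own.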
I ran into similar issues with context rot on a larger backend project recently. I ended up writing a tool that parses the AST to strip out function bodies and only feeds the relevant signatures and type definitions into the prompt.
It cuts down the input tokens significantly, which is nice for the monthly bill, but I found the main benefit is that it actually stops the model from getting distracted by existing implementation details. It feels a bit like overengineering, but it makes reasoning about the system architecture much more reliable when you don't have to dump the whole codebase into the context window.
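The core of it is small. A simplified sketch of the idea (not my actual tool; assumes the syn 2.x crate with the "full" feature, plus quote for ToTokens):

    use quote::ToTokens;
    use syn::{parse_file, Item};

    // Keep signatures and type definitions, drop function bodies.
    fn skeleton(source: &str) -> syn::Result<String> {
        let ast = parse_file(source)?;
        let mut out = String::new();
        for item in &ast.items {
            match item {
                // Emit the signature and stub out the body.
                Item::Fn(f) => {
                    out.push_str(&f.sig.to_token_stream().to_string());
                    out.push_str(" { /* ... */ }\n");
                }
                // Type definitions pass through untouched.
                Item::Struct(_) | Item::Enum(_) | Item::Trait(_) | Item::Type(_) => {
                    out.push_str(&item.to_token_stream().to_string());
                    out.push('\n');
                }
                _ => {}
            }
        }
        Ok(out)
    }

Run that over each module and paste the output in place of the raw files, and the model gets the shape of the codebase without the noise.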
> I don't really understand the code "in my bones".
Man, I absolutely hate this feeling.