Comment by JDye

We have an in-house, Rust-based proxy server. Claude is unable to contribute to it meaningfully outside of grunt work like minor refactors across many files. It doesn't seem to understand how proxying works, either at the protocol level or the business-logic level.

With some entirely novel work we're doing, it's actually a hindrance as it consistently tells us the approach isn't valid/won't work (it will) and then enters "absolutely right" loops when corrected.

I still believe those who rave about it are not writing anything I would consider "engineering". Or perhaps it's a skill issue and I'm using it wrong, but I haven't yet met someone I respect who tells me it's the future in the way those running AI-based companies tell me.

> We have an in-house, Rust-based proxy server. Claude is unable to contribute to it meaningfully outside

I have a great time using Claude Code in Rust projects, so I know it's not about the language exactly.

My working model is that since LLMs are basically inference/correlation based, the more you deviate from the mainstream corpus of training data, the more confused the LLM gets, because the LLM doesn't "understand" anything. But if it was trained on a lot of things kind of like the problem, it can match the patterns just fine, and it can generalize over a lot of layers, including programming languages.

Also, I've noticed that it can get confused by stupid stuff. E.g. I had two different things named kind of the same in two parts of the codebase, and it would constantly stumble by conflating them. Changing the name in the codebase immediately improved it.

So yeah, we've got another potentially powerful tool that requires understanding how it works under the hood to be useful. Kind of like git.

  • Recently the v8 Rust library changed from mutable handle scopes to pinned scopes. A fairly simple change that I even put in my CLAUDE.md file. But it still generates methods with HandleScopes, then says... oh, I have a different scope, and goes on a random walk refactoring completely unrelated parts of the code. All the while Opus 4.5 burns through tokens. Things work great as long as you are testing on the training set. That said, it is absolutely brilliant with React and TypeScript.

    • Well, it's not like it never happened to me to "burn tokens" with some lifetime issue. :D But yeah, if you're working in Rust on something with sharp edges, the LLM will get hurt. I just don't tend to have those in my projects.

      An even more basic failure mode: I told it to copy a chunk (1k LOC) of blocking code into a new module and convert it to async. It just couldn't do a proper 1:1 logical _copy_. But when I manually `cp <src> <dst>` the file and then told it to convert that to async and fix issues, it did it 100% correctly. Because fundamentally it's just a non-deterministic pattern generator.
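
      To be concrete about what I mean by a 1:1 logical copy, the transformation looks roughly like this (a made-up sketch with hypothetical function names, assuming tokio as the async runtime; not the actual code):

      ```rust
      use std::io::{Read, Write};
      use tokio::io::{AsyncReadExt, AsyncWriteExt};

      // Blocking original (hypothetical helper, not the real module).
      fn fetch_blocking(addr: &str) -> std::io::Result<Vec<u8>> {
          let mut stream = std::net::TcpStream::connect(addr)?;
          stream.write_all(b"GET / HTTP/1.0\r\n\r\n")?;
          let mut body = Vec::new();
          stream.read_to_end(&mut body)?;
          Ok(body)
      }

      // The desired "copy": identical structure and logic, async I/O via tokio.
      async fn fetch_async(addr: &str) -> std::io::Result<Vec<u8>> {
          let mut stream = tokio::net::TcpStream::connect(addr).await?;
          stream.write_all(b"GET / HTTP/1.0\r\n\r\n").await?;
          let mut body = Vec::new();
          stream.read_to_end(&mut body).await?;
          Ok(body)
      }
      ```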

This isn't meant as a criticism, or to doubt your experience, but I've talked to a few people who had experiences like this. I helped them get Claude Code set up, analyze the codebase and document the architecture into markdown (edited as needed afterward), create an agent for the architecture, and prompt it in an incremental way. Maybe 15-30 minutes of prep. Everyone I helped with this responded with things like "This is amazing", "Wow!", etc.

For some things you can fire up Claude and have it generate great code from scratch. But for bigger code bases and more complex architecture, you need to break it down ahead of time so it can just read about the architecture rather than analyze it every time.

  • Is there any good documentation out there about how to perform this wizardry? I always assumed that if you ran /init in a new codebase, Claude would set itself up to maximize its own understanding of the code. If there are extra steps that need to be done, why don't Claude's developers just add those extra steps to /init?

    • Not that I have seen, which is probably a big part of the disconnect. Mostly it's tribal knowledge. I learned through experimentation, but I've seen tips here and there. Here's my workflow (roughly):

      > Create a CLAUDE.md for a c++ application that uses libraries x/y/z

      [Then I edit it, adding general information about the architecture]

      > Analyze the library in the xxx directory, and produce a xxx_architecture.md describing the major components and design

      > /agent [let Claude make the agent, but when it asks what you want it to do, explain that you want it to specialize in subsystem xxx, and refer to xxx_architecture.md]

      Then repeat until you have the major components covered. Then:

      > Using the files named with architecture.md, analyze the entire system and update CLAUDE.md to refer to them and use the specialized agents.
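
      To give a rough idea of what the end result looks like (purely an illustrative sketch; x/y/z and xxx are the same placeholders as above, not real names from my setup), the CLAUDE.md ends up something like:

      ```markdown
      # CLAUDE.md
      <!-- Illustrative sketch only; x/y/z and xxx/yyy are placeholders. -->

      C++ application that uses libraries x/y/z.

      ## Architecture
      - High-level overview: notes added by hand after generation.
      - Subsystem xxx: see xxx_architecture.md
      - Subsystem yyy: see yyy_architecture.md

      ## Agents
      - Use the xxx agent for work in subsystem xxx (it reads xxx_architecture.md).
      - Use the yyy agent for work in subsystem yyy (it reads yyy_architecture.md).

      ## Workflow
      - Start non-trivial changes in planning mode and iterate on the plan.
      - Prefer incremental changes, with automated tests where possible.
      ```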

      Now, when you need to do something, put it in planning mode and say something like:

      > There's a bug in the xxx part of the application, where when I do yyy, it does zzz, but it should do aaa. Analyze the problem and come up with a plan to fix it, and automated tests you can perform if possible.

      Then, iterate on the plan with it if you need to, or just approve it.

      One of the most important things you can do when dealing with something complex is to let it come up with a test case, so it can fix or implement something and then iterate until it's done. I had an image-processing problem and gave it some sample data, then it iterated (looking at the output image) until it fixed it. It spent at least an hour, but I didn't have to touch it while it worked.

    • > if you did /init in a new code base, that Claude would set itself up to maximize its own understanding of the code.

      This is definitely not the case, and the reason Anthropic doesn't make Claude do this is that its quality degrades massively as you use up its context. So the solution is to let users manage the context themselves in order to minimize the amount that is "wasted" on prep work. Context windows have been increasing quite a bit, so I suspect that by 2030 this will no longer be an issue for any but the largest codebases, but for now you need to be strategic.

Are you still talking about Opus 4.5? I've been working in Rust, Kotlin, and C++, and it's been doing well. Incredible at C++, just in terms of the number of mistakes it doesn't make.

> I still believe those who rave about it are not writing anything I would consider "engineering".

Correct. In fact, this is the entire reason for the disconnect, where it seems like half the people here think LLMs are the best thing ever and the other half are confused about where the value is in these slop generators.

The key difference (despite everyone calling themselves an SWE nowadays) is between a "programmer" and an "engineer". Looking at OP, exactly zero of his screenshotted apps are what I would consider "engineering". Literally everything in there has been done over and over to death. Engineering is... novel, for lack of a better word.

See also: https://www.seangoedecke.com/pure-and-impure-engineering/

  • I don't think it's that helpful to try to gatekeep the "engineering" term or to separate it into "pure" and "impure" buckets, implying that one is lesser than the other. It should be enough to just say that AI-assisted development is much better at non-novel tasks than it is at novel tasks. Which makes sense: LLMs are trained on existing work and can't do anything novel, because if a model was trained on a task, that task is by definition not novel.

    • Respectfully, it's absolutely important to "gatekeep" a title that has an established definition and certain expectations attached to it.

      OP says, "BUT YOU DON’T KNOW HOW THE CODE WORKS.. No I don’t. I have a vague idea, but you are right - I do not know how the applications are actually assembled." This is not what I would call an engineer. Or a programmer. "Prompter", at best.

      And yes, this is absolutely "lesser than", just like a middleman who subcontracts his work to Fiverr (and has no understanding of the actual work) is "lesser than" an actual developer.

  • It's how you use the tool that matters. Some people get bitter and try to compare it to top engineers' work on novel things as a strawman so they can go "Hah! Look how it failed!" as they swing a hammer to demonstrate it cannot chop down a tree. Because the tool is so novel and its use is a lot more abstract than that of an axe, it is taking a while for some to see its potential, especially if they are remembering models from even six months ago.

    Engineering is just problem solving; nobody judges structural engineers for designing structures with another Simpson Strong Tie/No. 2 Pine 2x4 combo, because that is just another easy (and therefore cheap) way to rapidly get to the desired state. If your client/company wants to pay for art, that's great! Most just want the thing done fast and robustly.

    • I think it's also that the potential is far from being realized, yet we're constantly bombarded by braindead marketers trying to convince us that it's already the best thing ever. This is tiring, especially when leadership (not held back by any technical knowledge) believes them.

      I'm sure AI will get there, I also think it's not very good yet.

  • Coding agents as of Jan 2026 are great at what 95% of software engineers do. For the remaining 5% who do really novel stuff, the agents will get there in a few years.

  • When he said 'just look at what I've been able to build', I was expecting anything but an 'image converter'.