
Comment by _bin_

10 months ago

I just tried it. It got stuck looping on a `cargo check` call and literally wouldn't do anything else. No additional context, just repeatedly spitting out the same tool call.

The problem is the best models barely clear the bar for some stuff in terms of coherence and reliability; anything else just isn't particularly usable.

This happens when I'm using Claude Code too. Even the best models need humans to get unstuck.

From what I've seen, most of them are good at writing new code from scratch.

Refactoring is very difficult.

  • I tried it 3-4 times before giving up, and it did this every single time. I checked the tool call output and it was running `cargo check` appropriately. I think maybe the 30B-scale models just aren't sufficient for typical development.

    You're generally correct, though, that from-scratch gets better results. This is a huge limitation: I don't want a model that will write something its way. I've already gone through my design and settled on the style/principles/libraries I did for a reason; the bot working terribly with that is a major flaw, and I don't see "let the bot do things its preferred way" as a good answer. In some systems, things like latency matter, and the bot's way just isn't good enough.

    The vast majority of man-hours go to maintaining and extending code, not green-fielding new stuff. Vendors should be hyper-focused on this, on compliance with user directions, not on building something that makes a React todo-list app marginally faster or better than competitors.

    • If anything, it's a good sign that these tools are nowhere close to replacing us.

      I was trying to get Postgres working with a project the other day, and Claude decided that it was going to just replace it with SQLite when it couldn't get the build to work.

      All I want is "I don't know how to do this." But now these tools would rather just do it wrong.

      They also have a very very strong tendency to try and force unoptimized solutions. You'll have 3 classes that do the exact same thing with only minor variable differences. Something a human would do in one class.

      For my latest project I'm strongly tempted to just suck it up and code the whole thing by hand.