Comment by AndyNemmity
21 hours ago
I got similar feedback with my first blog post on my do router - https://vexjoy.com/posts/the-do-router/
Here is what I don't get: it's trivial to do this. Mine is, of course, customized to me and what I do.
The idea is to communicate the ideas, so you can use them in your own setup.
It's trivial to put, for example, my do router blog post into Claude Code and generate one customized for you.
So what does it matter to see my exact version?
These are the types of things I don't get. If I give you my details, it's less approachable, for sure.
The most approachable thing I could do would be to release individual skills.
Like I have skills for generating images with Google Nano Banana. That would be approachable and easy.
But it doesn't communicate the why. I'm trying to communicate the why.
I just don't have much faith in "if you're doing it right the results will be magically better than what you get otherwise" anymore. Any single person saying "the problems you run into with using LLMs will be solved if you do it my way" has to really wow me if they want me to put in effort on their tips. I generally agree with your why of why you set up like that. I'm skeptical that it will get over the hump of where I still run into issues.
When you've tried 10 ways of doing it but they all end up getting into a "feed the error back into the LLM and see what it suggests next" you aren't that motivated to put that much effort into trying out an 11th.
The current state of things is extremely useful for a lot of things already.
That's completely fair, I also don't have much faith in that anymore. Very often, the people who make those claims have the most basic implementation that barely qualifies as one.
I'm not sure if the problems you run into with using LLMs will be solved if you do it my way. My problems are solved doing it my way. If I heard more about your problems, I would have a specific answer to them.
These are the solutions to where I have run into issues.
For sure, but my solutions are not "feed the error back into the LLM." My solutions are varied, but as the blog shows, the theme is: move as much as possible into scripts and deterministic solutions, and keep the LLM to the smallest possible scope.
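To make that concrete, here is a minimal sketch of the shape I mean (the `llm_commit_message` helper is hypothetical, standing in for a real model call): everything around the model is a plain, deterministic script, and the LLM is confined to one small string-in, string-out task whose output is then validated deterministically.

```python
def llm_commit_message(diff: str) -> str:
    # Hypothetical stand-in for an actual model call. The LLM's
    # entire scope is this one task: diff text in, one line out.
    return "chore: update files"


def deterministic_pipeline() -> str:
    # Everything around the model call is ordinary scripting:
    # deterministic, testable, and debuggable without a model.
    diff = "example diff"  # e.g. the output of a git diff command
    message = llm_commit_message(diff)
    # Deterministic guardrails applied to the model's output:
    # first line only, capped at 72 chars, with a safe fallback.
    message = message.strip().splitlines()[0][:72]
    if not message:
        message = "chore: automated update"
    return message
```

The point isn't the commit message; it's that the blast radius of a bad model output is bounded by code you control.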
The current state of things is extremely useful for a subset of things. That subset feels small to me. But it may be that everything a certain person wants to do exists in that subset.
It just depends. We're all doing radically different things, and trying very different things.
I certainly understand and appreciate your perspective.
That makes sense.
My basic problem is: "first-run" LLM agent output frequently does one or more of the following: fails to compile/run, fails existing test coverage, or fails manual verification. The first two steps have been pretty well automated by agents: inspect output, try to fix, re-run. IME this works really well for things like Python, less well for things like certain Rust edge cases around lifetimes and such, or goroutine coordination, which require a different sort of reasoning than "typical" procedural programming.
But let's assume that the agents get even better at figuring out the deal with the more specialized languages/features and are able to iterate w/o interaction to fix things.
If the first-pass output still has issues, I still have concerns. They aren't "I'm not going to use these tools" concerns, because I also sometimes write bugs, and they can write the vast majority of code faster than I can.
But they are "I'm not gonna vibe-code my day job" concerns because the existence of trivially-catchable issues suggests that there's likely harder-to-catch issues that will need manual review to make sure (a) test coverage is sufficient, (b) the mental model being implemented is correct, (c) the outside world is interacted with correctly. And I still find bugs in these areas that I have to fix manually.
This all adds up to "these tools save me 20-30% of my time" (the first-draft coding) vs "these agents save me 90% of my time."
So I'm kinda at a plateau for a few months where it'll be hard to convince me to try new things to try to close that 20-30% -> 90% number.
Damn, it really is all just vibes eh? Everyone just vibes their way to coding these days, no proof AI is actually doing anything for you. It's basically just how someone feels now: that's reality.
In some sense, computers and digital things have now just become a part of reality, blending in by force.