
Comment by bonsai_bar

2 days ago

Yeah, that was a pretty lazy response on my part. Let me try again.

In my opinion, it takes several weeks of active use to nail down your preferred workflow with these tools and to get a meaningful understanding of their abilities and limitations.

I.e., yes they hallucinate and don't have great understanding of truth/fact (however you choose to define those terms), but you need to develop an intuition for how to work around those issues and how to recognize the problems in your setup that increase the likelihood of the LLM heading down false paths. This intuition cannot come until you fight through the initial struggle period.

In some ways, it's similar to picking up emacs/vim and learning the shortcuts. It's a negative to your velocity until it's not, and once you overcome that initial hurdle, your productivity takes off. Admittedly, it's not for everyone (I never bothered to learn the ins and outs of vim bindings because my bottleneck isn't my speed of writing code), but it provides a huge productivity boost for those types of engineers.

Coming back to my main point: your LLM needs quite a bit of guidance in the early stages, especially as you're feeling out what types of tasks it's able to knock out of the park and what types of tasks it'll struggle with. For instance, in the example you gave here, I wonder what would happen if you asked it to present you with a detailed plan before it gets to writing any code, along with a list of the assumptions it's making? You'll need to do a bit of review with it before you let it go execute the plan (similar to how a junior engineer would come to you with questions before being able to handle certain tasks).

I also recommend writing up a thorough self-review checklist that is stored in your repo (e.g. in an AGENTS.md file) and provides the customized instructions you want your LLM to follow (it won't always do so, but it helps a ton). Otherwise, each new session is essentially starting over without it learning anything, which is pretty frustrating.
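To make that concrete, here's a minimal sketch of what such a file might contain (the specific checklist items are just illustrative; tailor them to your repo and tooling):

```markdown
# AGENTS.md (illustrative example)

## Self-review checklist — run before declaring a task done

- Run the linter and fix every warning you introduced.
- Run the full unit test suite; do not skip failing tests.
- Grep for existing helpers before writing a new one.
- List every assumption you made and flag it for human review.
- If a task is ambiguous, present a plan and wait for approval
  before writing code.
```

The point is that these instructions persist across sessions, so you're not re-explaining your conventions every time you start a fresh chat.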

I'm happy to talk more because I'm pretty optimistic about LLMs and enjoy using them in my day-to-day where appropriate.

And finally, I'm not sure how much you've thought about giving them more autonomy, but I do recommend doing so if you have a safe, sandboxed environment. The real magic and productivity boost of LLMs come when you give them some more autonomy and provide them with tools to figure out the problems they encounter, unlocking your time to be spent on higher-leverage tasks such as designing systems and processes. If it can run linters, unit tests, and grep your codebase during its development process and use this to iterate, you'll have a much more fun time.

Does this help?

Thanks! Uh, it helps a little. I'll see about making an AGENTS.md. I guess I can agree that there's something to maximizing the amount of good output from an agent, and that to do that you need to give it a lot of access and info. I thought I did that, though, by giving it the whole codebase and linking the pubsub docs. Maybe the job I gave it was too small, but that's hard for me to accept: I wanted an entirely new module. Is the next step, like, asking it to make all the modules? What if I only need one?