Comment by kubb

8 hours ago

How do you check if what it produced is even the right thing? Models love to go chasing the wrong goal based on a reasonable spec.

When the end result has problems and needs to be reworked.

You can't figure this out instantly unless you review everything the LLM produces, which I don't. So the round-trip time is pretty long, but I can now trace a problem back to the intent because I commit every architecture decision as an ADR (architecture decision record), which is where I pour most of my energy. These are part of the repo.

Using these ADRs has helped a lot because most of the LLM's assumptions get surfaced early on, and they restrict its implementation leeway.

  • Got it. I imagine concurrency bugs will hit hard with this approach because they show up rarely and are hard to debug.

Do they? I haven't experienced models deviating from a spec in a very long time. If anything, I feel they are being too conservative and have started asking for confirmation too often.

  • The problem is not the LLM deviating from the plan (though that also happens, rarely, when it thinks it has a better idea) but rather that the plan is not strict enough, so the LLM decides on the fly HOW it is going to build your plan.