Comment by scwoodal
11 hours ago
> I'd rather see a coding agent that can follow steps in a plan file to a T while following guardrails and adhering to the proper coding conventions in the human reviewed spec.
Guardrails/conventions should be enforced in linters, formatters, static analysis tooling; not specs/prompts.
It's not always possible, or at least not always trivial. For example, how do you enforce "prefer to reuse existing code over making a copy"? Is there a static analysis tool that will detect two pieces of code that do the same thing?
Yes, it's possible:
https://github.com/elixir-vibe/ex_dna
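To make "detect two pieces of code that do the same thing" concrete, here is a minimal sketch in Python using only the standard `ast` module. It is not how ex_dna works (I haven't checked its internals); it only catches functions that are identical up to renaming of parameters and local variables, and `fingerprint`, `find_clones` and `SAMPLE` are names made up for this illustration.

```python
import ast
import copy
from collections import defaultdict


def fingerprint(func):
    """Return a string that is equal for functions differing only in naming."""
    func = copy.deepcopy(func)  # leave the caller's tree untouched
    # Names bound inside the function: parameters first, then assignment targets.
    bound = [n.arg for n in ast.walk(func) if isinstance(n, ast.arg)]
    bound += [
        n.id for n in ast.walk(func)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    ]
    mapping = {}
    for name in bound:
        mapping.setdefault(name, f"v{len(mapping)}")
    # Rewrite bound names to positional placeholders; globals/builtins stay as-is.
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            node.arg = mapping[node.arg]
        elif isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]
    func.name = "_"
    return ast.dump(func)


def find_clones(source):
    """Group function names whose normalized syntax trees are identical."""
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            groups[fingerprint(node)].append(node.name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical input: two renamed copies and one unrelated function.
SAMPLE = '''
def total(items):
    acc = 0
    for item in items:
        acc += item
    return acc

def add_up(values):
    result = 0
    for v in values:
        result += v
    return result

def biggest(xs):
    return max(xs)
'''

if __name__ == "__main__":
    print(find_clones(SAMPLE))
```

This only finds renamed copies. Code that computes the same thing with a different structure (a loop versus a call to `sum`, say) would not be flagged, and differing docstrings would also defeat it, so treat it as a lower bound on what such a tool can do.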
Let's say you have a table that is partitioned. How do you lint/format "any select into this table MUST include the partition key in the predicate and any join must include it in the on"? I'm not personally familiar with any static analysis tool that does this, but it's trivial to implement with an LLM prompt, and trivially easy to add to your automated PR reviews.
I would tell the LLM to write a custom rule/check for whatever the scenario is. Then when the CI gate is run, all my custom checks get deterministically run.
I prefer to build software in Elixir, so for me that would mean writing a custom Credo check.
https://github.com/rrrene/credo
https://credo.hexdocs.pm/adding_checks.html
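For the partitioned-table rule upthread, a deterministic check could look roughly like the sketch below. It is Python rather than a Credo check, it uses naive regexes instead of a real SQL parser, and it reads the rule as applying to queries that select from or join the table. The table `events`, the key `event_date`, and the function `check_query` are all hypothetical.

```python
import re

# Hypothetical mapping: partitioned table -> its partition key column.
PARTITIONED = {"events": "event_date"}


def check_query(sql):
    """Return a list of rule violations for one SQL string (naive, regex-based)."""
    problems = []
    text = " ".join(sql.split()).lower()  # collapse whitespace, ignore case
    where = re.search(r"\bwhere\b(.*)", text)
    where_body = where.group(1) if where else ""
    for table, key in PARTITIONED.items():
        t, k = re.escape(table), re.escape(key)
        # Rule 1: selecting from the table requires the key in the WHERE clause.
        if re.search(rf"\bfrom {t}\b", text) and not re.search(rf"\b{k}\b", where_body):
            problems.append(f"{table}: WHERE clause must filter on {key}")
        # Rule 2: every JOIN on the table requires the key in its ON clause.
        # The ON clause is taken to run until the next JOIN, WHERE, or the end.
        for m in re.finditer(rf"\bjoin {t}\b(.*?)(?=\bjoin\b|\bwhere\b|$)", text):
            if not re.search(rf"\b{k}\b", m.group(1)):
                problems.append(f"{table}: JOIN ... ON must include {key}")
    return problems


if __name__ == "__main__":
    for query in (
        "SELECT * FROM events WHERE user_id = 1",
        "SELECT * FROM events WHERE event_date = '2024-01-01'",
        "SELECT * FROM users u JOIN events e ON e.user_id = u.id WHERE u.id = 1",
    ):
        print(query, "->", check_query(query) or "ok")
```

A regex will be fooled by subqueries, CTEs, comments and string literals, so a real check would want a proper SQL parser. The point is only that once the check exists, the CI gate runs it the same way every time.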
Wrong. Custom "specs", i.e. schemas, are literally all we have for "real" guardrails with LLMs.
https://developers.openai.com/api/docs/guides/structured-out...
Nothing else operates on the logprobs level and literally bans continuations that fail your schema.
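As a toy illustration of what "banning continuations" means, the sketch below masks the logits of any token that would take the output outside a tiny "schema", which here is just a fixed list of allowed strings. This is an assumption-laden sketch, not a description of how any particular provider implements structured outputs; `ALLOWED`, `mask_logits`, `greedy_decode` and the fake scoring function are invented for the example.

```python
import math

# Toy "schema": the only outputs considered valid.
ALLOWED = ['{"ok": true}', '{"ok": false}']


def mask_logits(prefix, vocab, logits):
    """Set the logit to -inf for every token that cannot lead to a valid output."""
    masked = []
    for token, logit in zip(vocab, logits):
        candidate = prefix + token
        still_valid = any(s.startswith(candidate) for s in ALLOWED)
        masked.append(logit if still_valid else -math.inf)
    return masked


def greedy_decode(vocab, score):
    """Greedy decoding where `score(prefix)` stands in for the model's logits."""
    out = ""
    while out not in ALLOWED:
        logits = mask_logits(out, vocab, score(out))
        best = max(range(len(vocab)), key=lambda i: logits[i])
        if logits[best] == -math.inf:
            raise ValueError("no token keeps the output valid")
        out += vocab[best]
    return out


# Hypothetical vocabulary and a fake model that strongly prefers "maybe".
VOCAB = ['{"ok": ', "true", "false", "}", "maybe"]


def fake_model(prefix):
    return [0.1, 0.2, 0.3, 0.0, 5.0]


if __name__ == "__main__":
    print(greedy_decode(VOCAB, fake_model))
```

Even though the fake model always scores "maybe" highest, that token can never be emitted because its logit is masked at every step. Note that this constrains the shape of the output only; it says nothing about whether the code inside a valid structure follows a project's conventions.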
Enforcing structured outputs from LLMs is not the same thing as using linters, formatters, and static analysis to control how an agent writes code.
No, it's not. It's strictly better.