Comment by blinkymach12
4 days ago
We're in a transition phase today where agents need special guidance to understand a codebase that goes beyond what humans need. Before long, I don't think they will. I think we should focus on making our own project documentation comprehensive (e.g. the contents of this AGENTS.md are appropriate to live somewhere in our documentation), but we should always write for humans.
The LLM's whole shtick is that it can read and comprehend our writing, so let's architect for it at that level.
It's not just understanding the codebase, it's also stylistic things, like "use this assert library to write tests", or "never write comments", or "use structured logging". It's just as useful --- more so even --- on fresh projects without much code.
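Concretely, on a fresh project that kind of guidance might be nothing more than a few lines (a made-up sketch):

    # AGENTS.md (hypothetical)
    - Write tests with plain pytest asserts, not an xUnit-style assertion library.
    - Never write comments; prefer self-explanatory names.
    - Use structured logging; never print().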
Honestly, everything I have written in markdown files as AI context fodder is stuff that I write down for human contributors anyway. Or at least stuff I want to always write down, but maybe only halfway do. The difference now is it is actually being read, seemingly understood, and often followed!
So true. I find myself doing a lot more documentation these days as it is actually having a direct visible benefit. There’s a bit of a mirage here, but hey it’s getting me to document so shhh.
Stylistic preferences could usually be inferred by just looking at the code. Perhaps if code is mid-refactor there may be inconsistencies, but an ideal AI coding agent could also look through git history.
... most of which would also be valuable information to communicate when onboarding new devs.
Yeah I agree. I think the best place for all this is CONTRIBUTING.md, which is already a standard-ish thing. I've started adding it even to my private projects that only I work on: when I have to come back in 3 or 4 months, I always appreciate it.
If there were already a universal convention on where to put that stuff, then probably the agents would have just looked there. But there's not, so it was necessary to invent one.
I suspect machine-readable practices will become standard as AI is incorporated more into society.
A good example is autonomous driving and local laws / context. "No turn on red. School days 7am-9am".
So you need: where am I, when are school days for this specific school, and what the current datetime is. You could attempt to gather that through search. Though more realistically I think the municipality will make the laws require less context, or some machine-readable (e.g. QR code) transfer of information will be on the sign. If they don't, there's going to be a lot of rule-breaking.
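The payload wouldn't need to carry much. A hypothetical encoding of that example sign (every field name here is made up):

    # Hypothetical machine-readable encoding of "No turn on red. School days 7am-9am"
    sign = {
        "rule": "no-turn-on-red",
        "active": {
            "days": "school-days",
            "calendar": "https://example.gov/schools/123/calendar",  # made-up URL
            "from": "07:00",
            "to": "09:00",
        },
    }

The hard part is the calendar reference: the car still has to fetch and trust it, which is exactly the context-gathering problem the sign was supposed to remove.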
Very strong "reverse centaur" vibes here, in the sense of humans becoming servants to machines, instead of vice versa. Not that I think making things more machine-readable is a waste of time, but you have to keep in mind the amount of human time sacrificed.
Well, it wouldn't even be the first time.
We've completely redesigned society around cars, making the most human-populated environments largely worse for humans along the way.
Universal sidewalks (not really needed with slow-moving traffic like horses and carts, though nice even back then), traffic lights, stop signs, street crossings, interchanges, etc.
Those particular signs are just stupid. The street should be redesigned with traffic calming, narrowing, and chicanes so that speeding is not possible.
Slapping on a sign is ineffective.
Maybe for new schools. Old schools don't have the luxury of being able to force adjacent road-design changes in most cases. Also, I've frequently seen school zones extended out in several directions away from the school to make heavily trafficked intersections feeding toward the school safer, for pedestrian and motorist alike. The real world is generally never so black and white. We have to deal with that gray nuance all the time.
Completely agree
That seems anachronistic, form over function. Machines should be able to access an API that returns "signs" for their given location. These signs don't need any real-world presence and can be updated instantly.
I also see this happening. What does that mean for business specifications? Do they become close to code syntax itself?
I think they'll always need special guidance for things like business logic. They'll never know exactly what it is that you're building and why, what the end goal of the project is without you telling them. Architectural stuff is also a matter of human preference: if you have it mapped out in your head where things should go and how they should be done, it will be better for you when reading the changes, which will be the real bottleneck.
Indeed I have observed that my coworkers "never know exactly what it is that [we]'re building and why, what the end goal of the project is without [me] telling them"
I agree with this general sentiment, but there might be some things you want to force into the context every time via a specific agent file.
Not at all. Good documentation for humans works well for models too, but they need so much more detail and context than humans to be reliable that it calls for a different style of description.
It needs to contain things that you would never write for humans. They also do stupid things which need to be corrected by these descriptions.
One of the most common usages I see from colleagues is to get agents to write the comments so you can go full circle. :)
Unless we write down what we often consider implicit, the LLM will not know it. There might be the option to deduce some implicit requirements from the code, but unlikely 100% of them. Thus making the requirements explicit is the way to go.
Yes! That was precisely my point here: https://news.ycombinator.com/item?id=44837875
Better to work with the tools we have instead of the tools we might one day have. If you want agents to work well today, you need to build for the agents we have today.
We may never achieve your future where context is unlimited, models are trained on your codebase specifically, and tokens are cheap enough to use all of this. We might have a bubble pop and in a few years we could all be paying 5-10X current prices (read: the actual cost) for similar functionality to today. In that reality, how many years of inferior agent behavior do you tolerate before you give up hoping that it will evolve past needing the tweaks?
> We're in a transition phase today where agents need special guidance to understand a codebase that goes beyond what humans need. Before long, I don't think they will.
This isn't guaranteed. Just like we will never have fully self-driving cars, we likely won't have fully human-quality coders.
Right now AI coders are going to be another tool in the tool bucket.
I don't think the bar here is a human-level coder; I think the bar is an LLM which reads and follows the README.md.
If we're otherwise assuming it reads and follows an AGENTS.md file, then following the README.md should be within reach.
I think our task is to ensure that our README.md is suitable for any developer to onboard into the codebase. We can then measure our LLMs (and perhaps our own documentation) by whether that guidance is followed.
Have you taken a Waymo?
Waymo uses a bespoke 3D data representation of the SF roads, does it not? The self-driving car equivalent of an AGENTS.md file.
The limited self-driving cars, with a remote human operator? No, I never have.
> Just like we will never have fully self-driving cars, we likely won't have fully human-quality coders.
“Never is a long time...and none of us lives to see its length.” Elizabeth Yates, A Place for Peter (Mountain Born, #3)
“Never is an awfully long time.” J.M. Barrie, Peter Pan
This is mostly true if the existing codebase is largely self-documented, which is rare.
This applies to MCP too.
A few days ago I gave codex a prompt along the lines of: explore the repo and generate an AGENTS.md from what you find.
It did a decent job. I didn't really have much to add to that. I guess having this file is a nice optimization, but obviously it doesn't contain anything the agent wasn't able to figure out by itself. What's really needed is a per-repository learning base that gets populated with facts the agent discovers during its many experiments with the repository over the course of many conversations. It's a performance optimization.
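As a sketch of what I mean, with every name hypothetical:

    import datetime
    import json
    import pathlib

    FACTS = pathlib.Path(".agent/facts.jsonl")  # hypothetical location

    def remember(fact: str, source: str) -> None:
        """Append a fact the agent discovered so later conversations can reload it."""
        FACTS.parent.mkdir(parents=True, exist_ok=True)
        entry = {
            "fact": fact,      # e.g. "integration tests need the docker daemon running"
            "source": source,  # e.g. "failed `make test` run"
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        }
        with FACTS.open("a") as f:
            f.write(json.dumps(entry) + "\n")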
The core problem is that every conversation is like Groundhog Day: you always start from scratch. Agents.md is a stopgap solution for that problem. ChatGPT actually has some notional memory that works across conversations, but it's a bit flaky, slow, and limited. It doesn't really learn across conversations.
That, by the way, is a big missing piece on the path to AGI. There are some imperfect workarounds, but a lot of knowledge is lost between conversations. And the trick of just growing the amount of context we give to our prompts doesn't seem like the solution.
I see the Groundhog Day problem as a feature, not a bug.
It's an organizational challenge, requiring a top-level overview, easy-to-find sub-documentation, and clear directives to use them when the AI starts architecting on a fresh start.
Overall, it's a good sign when a project is understandable in small independent chunks that don't demand a programmer/LLM take in more context than was referenced.
I think the sweet spot would be all agents agreeing on a MUST-READ reference syntax for comments & docs that, through simple scanning, forces the referenced file into the context, e.g.

    // See @{../docs/payment-flow.md} for the overall design.
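A trivial scanner would be enough to honor the convention; a sketch (the @{...} syntax is the proposal above, the rest is made up):

    import pathlib
    import re

    MUST_READ = re.compile(r"@\{([^}]+)\}")

    def collect_context(source: pathlib.Path) -> list[str]:
        """Return the contents of every @{...}-referenced file,
        resolved relative to the file that references it."""
        docs = []
        for match in MUST_READ.finditer(source.read_text()):
            ref = (source.parent / match.group(1)).resolve()
            if ref.is_file():
                docs.append(ref.read_text())
        return docs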
Your prompt is pretty basic. Both Claude Code and GitHub Copilot have similar features. Claude Code has `init`, which has a lot of special sauce in the prompt to improve the CLAUDE.md, and GitHub Copilot added a self-documenting prompt as well that runs on new repos; you can see their prompt here: https://docs.github.com/en/copilot/how-tos/configure-custom-...
Reading their prompt gives ideas on how you can improve yours.