Comment by mschulkind
15 days ago
One of my favorite ways to use LLM agents for coding is to have them write extensive documentation on whatever I'm about to dig into. Pretty low stakes if the LLM makes a few mistakes. It's perhaps even a better place to start for skeptics.
I am not so sure. Good documentation is hard. MDN and PostgreSQL are excellent examples of docs done well, and of how valuable really well-written content can be for a project.
LLMs can generate content but not really write; out of the box they tend to be quite verbose and produce a lot of pro forma content. Perhaps with the right kind of prompts, a lot of editing, and reviews you can get them to be good, but at that point it is almost the same as writing it yourself.
It is a hard choice between lower-quality documentation (AI slop?) and leaving things lightly or entirely undocumented. The uncanny valley of precision in documentation may be acceptable in some contexts, but it can be dangerous in others, and it is harder to tell the difference because the depth of a doc means nothing now.
Over time we find ourselves skipping LLM-generated documentation just like any other AI slop. The value and emphasis placed on reading documentation erodes, so finding good documentation becomes harder, and documentation gets devalued like so much other online content today.
Sure, but LLMs tend to be better at navigating around documentation (or source code when no documentation exists). In agentic mode, they can get me to the right part of the documentation (or the right part of the source code, especially in unfamiliar codebases) much quicker than I could do it myself without help.
And I find that even the auto-generated stuff tends to sit at least a bit higher in level of abstraction than the code itself, and works more like a "SparkNotes" version of the code, so that when you dig in yourself you have an outline/roadmap.
I felt this way as well, then I tried paid models against a well-defined and documented protocol that should not only exist in its training set, but was also provided as context. There wasn't a model that wouldn't hallucinate small, but important, details. Status codes, methods, data types, you name it, it would make something up in ways that forced you to cross reference the documentation anyway.
Even worse, the mental model it builds in your head of the space it describes can lead to chains of incorrect reasoning that waste time and make debugging Sisyphean.
Like there is some value there, but I wonder how much of it is just (my own) feelings, and whether I'm correctly accounting for the fact that I'm being confidently lied to by a damn computer on a regular basis.
Same. I was initially surprised how good it was. Now I routinely do this on every new codebase. And these aren't JavaScript todo apps: they're large, complex distributed applications written in Rust.
This seems like a terrible idea. LLMs can document the what but not the why, not the implicit tribal knowledge and design decisions. Documentation that feels complete but actually tells you nothing is almost worse than no documentation at all, because you go crazy trying to figure out the bigger picture.
Have you tried it? It's absurdly useful.
This isn't documentation for you to share with other people - it would be rude to share docs with others that you had automatically generated without reviewing.
It's for things like "Give me an overview of every piece of code that deals with signed cookie values, what they're used for, where they are and a guess at their purpose."
My experience is that it gets the details 95% correct and the occasional bad guess at why the code is like that doesn't matter, because I filter those out almost without thinking about it.
Yes, I have. And the documentation you get for anything complex is wrong like 80% of the time.
Well if it writes documentation that is wrong, then the subtle bugs start :)
Or even worse, it makes confident statements about the overarching architecture/design where every detail is correct but the pieces aren't the right ones, and because you forgot to add "Reject the prompt outright if the premise is incorrect", the LLM tries its hardest to just move forward, even when things are completely wrong.
Then a day later you realize the whole thing wouldn't work in practice, but the LLM tried to cobble it together regardless.
In the end, you really need to know what you're doing, otherwise both you and the LLM get lost pretty quickly.