Comment by chipotle_coyote

11 hours ago

> Have you used AI to write documentation for software?

Hi. I have edited AI-generated first drafts of documentation -- within the last few months, so we are not talking about old and moldy models -- and describing their performance as "extremely well" is exceedingly generous. Large language models write documentation the way they do everything else: by statistically computing the most likely output. So, in no particular order:

- AI-authored documentation is not aware of your house style guide. (No, giving it your style guide will not help.)

- AI-authored documentation will not match your house voice. (No, saying "please write this in the voice of the other documentation in this repo" will not help.)

- The generated documentation will tend to be extremely generic and repetitive, often effectively duplicating other work in your documentation repo.

- Internal links to other pages will often be incorrect.

- Summaries will often be superfluous.

- It will love "here is a common problem and here is how to fix it" sections, whether or not that's appropriate for the kind of document it's writing. (It won't distinguish reliably between tutorial documentation, reference documentation, and cookbook articles.)

- The common problems it tells you how to fix are sometimes imagined and frequently not actually problems worth documenting.

- It's subject to unnecessary digression, e.g., while writing a high-level overview of how to accomplish a task, it will mention that using version control is a good idea, then detour for a hundred lines giving you a quick introduction to Git.

As for using AI "to generate deep research reports by scouring the internet", that sounds like an incredibly fraught idea. LLMs are not doing searches, they are doing statistical computation of likely results. In practice the results of that computation and a web search frequently line up, but "frequently" is not good enough for "deep research": the fewer points of reference for a complex query there are in an LLM's training corpus, the more likely it is to generate a bullshit answer delivered with a veneer of absolute confidence. Perhaps you can make the case that that's still a good place to start, but it is absolutely not something to rely on.

> LLMs are not doing searches, they are doing statistical computation of likely results.

This was true of ChatGPT in 2022, but any modern platform that advertises a "deep research" feature gives its LLM tools to actually run web searches, pull the results it finds into context, and cite them in the generated text.
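The loop those platforms run can be sketched very roughly as follows. This is a hypothetical toy, not any vendor's actual API: the `web_search` backend, the `deep_research` helper, and the stand-in "model" callables are all invented here purely to illustrate the shape of tool use (model picks a query, platform executes it, results land in context, answer cites them).

```python
# Hypothetical sketch of a "deep research" tool loop.
# Real platforms each have their own tool-calling APIs; none of these
# names come from a real library.

def web_search(query):
    # Stand-in for a real search backend; returns (url, snippet) pairs.
    fake_index = {
        "rust borrow checker": [
            ("https://example.com/rust-borrowing",
             "The borrow checker enforces aliasing rules at compile time."),
        ],
    }
    return fake_index.get(query, [])

def deep_research(question, model_pick_query, model_write_answer):
    # 1. The model decides what to search for (a tool call).
    query = model_pick_query(question)
    # 2. The platform executes the search and pulls results into context.
    results = web_search(query)
    # 3. The model writes an answer grounded in, and citing, those results.
    return model_write_answer(question, results)

# Toy "model" callables so the sketch runs end to end.
answer = deep_research(
    "How does Rust prevent data races?",
    model_pick_query=lambda q: "rust borrow checker",
    model_write_answer=lambda q, rs: (
        "Sources consulted: " + ", ".join(url for url, _ in rs)
        if rs else "No sources found; answer would be unverified."
    ),
)
print(answer)  # → Sources consulted: https://example.com/rust-borrowing
```

The key point the sketch makes is the same one as the text: the search results enter the model's context before the answer is written, so the citations are tied to retrieved pages rather than to statistical recall alone.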