Comment by crazygringo
13 days ago
Plenty of documentation, and plenty of code that the AI can read itself.
E.g. if a library has a bug that has a common workaround, it can learn that from open source code using the library that uses the workaround.
This and the other thread that talks about RL and synthetic data seem to suggest that AI can figure out all the technical issues without humans looking into them. I'm not sure that's true at all.
That assumes there is documentation or examples. A big reason Stack Overflow took off was people struggling with things like the Android API documentation.
Some of those discussions made people go figure out how to do it, and then post it as an answer. The knowledge didn't exist anywhere until they did.
It might make sense for AI companies to throw agents at new technologies to trial-and-error their way to internal documentation which they then provide to their models. On the other hand, the people making tomorrow's APIs have LLMs too and that makes documentation ~free. Hallucinations could still bring you back to the first hand, though.
When I talk about code it can learn from, I'm talking about GitHub etc.
Even if stuff isn't in the official documentation, eventually there are projects that use it.
And if the library in question is open-source, then the LLMs can just ingest and read that directly.
Sounds nothing like the world we live in. When has there ever been an abundance of software documentation? How can plenty of documentation or code be made if AI scraper bots hammer the servers that host it, steal content and drive people away from the actual authors?
The only way I could see this being surfaced the same is if the code essentially had a SO answer written into the doc comment.
What documentation?
Lots of undocumented gotchas only surfaced because someone used it and posted about it.