Comment by almostgotcaught
4 days ago
> For whom?
what is the name for this kind of pointless, lazy, selective, quoting that willfully misconstrues what's being quoted? the answer to this question is incredibly clear: for the developer that created this tool. if that makes you unhappy enough to malign them then maybe you should just not use it?
> pointless, lazy, selective, quoting that willfully misconstrues what's being quoted
They quoted the part they were replying to. The point was to show what they were asking about. If your question pertains to only a part of the text, it only makes sense to be selective. That's not wilfully misconstruing anything; that’s communicating in a clear, easy-to-follow way. The context is still right up there for reading, for anyone who needs to review it.
> the answer to this question is incredibly clear: for the developer that created this tool
Questions aren’t only ever asked out of pure curiosity; sometimes they’re asked to make the other person give them more consideration. The question you quote was accompanied by an explanation of how the commenter found the approach less simple for them as a user, suggesting that perhaps they think the developer would have done better to consider that a higher priority. (I might add that you, too, chose to selectively omit this context from your quoting—which I personally don’t see as problematic on its own, but the context does require consideration, too.)
> if that makes you unhappy enough to malign them then maybe you should just not use it?
The author of the extension chose to share what they made for others to use. They asked for feedback on user experience and expressed doubt about their design decisions. If someone finds they might not want to use it because of what they consider fundamentally flawed design, why couldn’t they tell the author? It’s not like they were rude or accused them of any wrong-doing (other than possibly making poor design choices).
lol thank you, I was just going to respond to them. One thing I should mention too is that if it were at all practical to build without using generative AI, someone would have built something similar years ago before LLMs.
If there’s any amount of irony in your comment, I’m missing it - and I apologize for that.
That said, people have built this without LLMs years, even decades, ago. But UX has fallen by the wayside for quite some time in the companies that used to build IDEs. Then some fresher devs come along and begin a project without the benefit of experience in a codebase with a given feature … and after some time someone writes a plugin for VSCode to provide documentation tooltips generated by an LLM because “there is just no other way it can be done.”
We have language servers for most programming languages. Those language servers provide the tokens one needs to use when referencing the documentation. And it would be so much faster than waiting for an LLM to get back to you.
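For anyone who hasn't poked at a language server directly, here is a rough sketch of what asking one for hover documentation looks like on the wire. The `textDocument/hover` method and the `Content-Length` framing come from the Language Server Protocol; the helper name `frame_hover_request` and the example file URI are made up for illustration, and a real client would also need to do the `initialize` handshake and open the document first.

```python
import json


def frame_hover_request(request_id, file_uri, line, character):
    """Build one framed LSP `textDocument/hover` request as bytes.

    Hypothetical helper for illustration only. Positions are zero-based,
    as the protocol specifies.
    """
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/hover",
        "params": {
            "textDocument": {"uri": file_uri},
            "position": {"line": line, "character": character},
        },
    }).encode("utf-8")
    # LSP messages are prefixed with a Content-Length header (byte count
    # of the JSON body) followed by a blank line.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


if __name__ == "__main__":
    # Example: ask about the symbol at line 4, column 9 of a local file.
    print(frame_hover_request(1, "file:///tmp/main.rs", 3, 8).decode("utf-8"))
```

The server answers with the documentation for whatever symbol sits at that position, which is why this only works when the server has the file and its project available locally.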
TBH, if anyone’s excuse is “an LLM is the only way to implement feature Q,” then they’re definitely in need of some experience in software creation.
I don't think you're wrong, but question: it's the weekend, you have an idea for something like this that you want to crank out. Is it really better for you to never ship because it takes a long time to build, or is it better to be able to ship using something like an LLM?
In my opinion the shipped product is better than the unshipped product. While of course I would prefer the version that you have designed, I sure don't have time to build it, and I'm guessing you don't either.
If this was our day jobs and we were being paid for it, it would be a much different story, but this is a hobby project made open source for the world.
I agree that parsing codebases and linking code to documentation is a solved problem. I think @ramon156's suggestion to use tree-sitter or something similar to parse an abstract syntax tree makes sense.
To clarify my earlier point, I wasn't suggesting this is impossible, just that it's not *practical* to build a universal LSP that works with every language and framework out of the box without anything local to index. I don't think reusing an LSP would be a great fit here either, since LSPs rely on having full project context, dependencies, and type information. These aren't available when analyzing code snippets on arbitrary webpages.
Parsing was never my major concern though. It's the "map tokens to URLs" part. A universal mapping for every token to every piece of documentation on the internet is *impractical* and difficult to maintain. To achieve parity without LLMs, I'd need to write and maintain parsers for every documentation website, and that assumes documentation even exists for most tokens (which it doesn't).
I think kristopolous's suggestion of grounding the LLM with data sources that keep a serialized database of documentation from many different places makes the most sense. That way, the LLM is just extracting and presenting key information from real documentation rather than generating from scratch.
There are probably ways to make this easier. Maybe an offline job that uses LLMs to keep mappings up to date. The project could also be scoped down to a single ecosystem like Rust where documentation is centralized, though that falls apart once you try to scale beyond one language as mentioned above. Maybe I could use raw definitions on GitHub combined with an LSP to generate information?
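To make the single-ecosystem idea concrete, here is a minimal sketch of the "map tokens to URLs" step for Rust only. It relies on the URL layout rustdoc generates for doc.rust-lang.org and docs.rs; the function name `rust_doc_url` and the short list of item kinds are my own assumptions, and it assumes the caller already knows the full path and whether the item is a struct, trait, function, and so on (which is exactly the information a bare snippet on a webpage doesn't give you).

```python
# Crates whose docs live on doc.rust-lang.org instead of docs.rs.
STD_CRATES = {"std", "core", "alloc"}

# Subset of rustdoc item kinds that appear as "<kind>.<Name>.html".
ITEM_KINDS = {"struct", "enum", "trait", "fn", "macro"}


def rust_doc_url(path, kind):
    """Map a fully qualified Rust path plus item kind to a docs URL.

    Hypothetical helper for illustration only. Assumes the crate's
    package name matches its library name (no hyphen/underscore
    difference) and always links to the latest docs.rs version.
    """
    if kind not in ITEM_KINDS:
        raise ValueError(f"unsupported item kind: {kind}")
    parts = path.split("::")
    if len(parts) < 2:
        raise ValueError(f"expected crate::...::Name, got: {path}")
    crate, *modules, name = parts
    if crate in STD_CRATES:
        base = f"https://doc.rust-lang.org/{crate}"
    else:
        base = f"https://docs.rs/{crate}/latest/{crate}"
    return "/".join([base, *modules, f"{kind}.{name}.html"])


if __name__ == "__main__":
    print(rust_doc_url("std::vec::Vec", "struct"))
    print(rust_doc_url("serde::Serialize", "trait"))
```

Even in this best case the mapping needs the item kind and the unabbreviated path, so something still has to resolve `Vec` in a snippet to `std::vec::Vec`. It also says nothing about re-exports, methods, or pinned versions.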
Open to other suggestions on how to bridge this gap.