
Comment by neutronicus

3 hours ago

I've been frustrated with Copilot in this regard.

I work on a large C++ codebase, with large files. Human developers jump around between files with the Visual Studio fuzzy search, set breakpoints to trace execution in the debugger, and use the IDE's refactoring tools.

Microsoft's answer to this was to just ... expose none of this to their Agent Mode!? Replace the working semantic autocomplete with fucking lies!?

Maybe it's changed, I haven't been paying that much attention after bouncing off of this. I've gotten mild acceleration from using gptel-mode in emacs, manually adding references to context, and having models do various mechanical transformations on code. And I've even had some limited success writing tools for it to do LSP lookups.

It frustrates me too. It really feels like the next breakthrough will come when someone gets agents working "natively" with LSP on large codebases.

Anthropic added LSP support to Claude Code, but the current implementation is worse than useless, because changes aren't reflected fast enough. It's constantly working on outdated views / compilation caches, and it gets in a right muddle between its "internal" state / understanding in context, the real-world file, and the LSP.

If it could just leverage LSP to apply refactorings it would be amazing, but it feels like the LSP can't keep up, and I don't know if that's an LSP problem or a claude problem.

So we binned the LSP plugin and we're back to watching a machine find/replace. Waiting on that is slower than LSP, but it's an "Action => Wait" flow, which the tooling understands, while LSP is "Possibly Wait for LSP to catch up => Action", which it doesn't understand nearly as well.
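One way to make the "Possibly Wait for LSP to catch up" step explicit is to compare the document version you last sent with the version the server last reported on. The sketch below is my own illustration, not how any existing plugin works. It relies on the optional `version` field of `textDocument/publishDiagnostics` (LSP 3.15+), which not every server fills in. It also assumes that diagnostics for a version are a reasonable proxy for the server having caught up, which the spec does not promise. `FreshnessTracker` and its method names are hypothetical.

```python
class FreshnessTracker:
    """Tracks whether the server has reported on the latest version we sent."""

    def __init__(self) -> None:
        self._sent: dict[str, int] = {}
        self._seen: dict[str, int] = {}

    def record_sent(self, uri: str, version: int) -> None:
        # Call whenever a didOpen/didChange carrying this version goes out.
        self._sent[uri] = version

    def record_diagnostics(self, params: dict) -> None:
        # `params` is the body of a textDocument/publishDiagnostics
        # notification. `version` is optional, so skip it when absent.
        version = params.get("version")
        if version is not None:
            uri = params["uri"]
            self._seen[uri] = max(version, self._seen.get(uri, version))

    def is_fresh(self, uri: str) -> bool:
        # With no evidence either way, treat the document as stale.
        if uri not in self._sent or uri not in self._seen:
            return False
        return self._seen[uri] >= self._sent[uri]
```

An agent harness could check `is_fresh(uri)` before issuing a rename or references request, and fall back to a bounded wait when it returns `False`.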

I suspect the LSP plugins also need better skills that pair with them so it reaches for them more often.

It hurts my soul to see it reach for find/replace to rename a class, complete with mistakes made in complex solutions where you might have name clashes in different namespaces. That's something the LSP handles without a problem, but it can trip up an LLM.
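To make the name-clash failure concrete, here is a toy illustration. The C++ snippet and the names `net::Buffer` / `gfx::Buffer` are made up for the example. A purely textual rename has no idea which `Buffer` it is looking at, so it rewrites both.

```python
import re

# Made-up C++ source with the same class name in two namespaces.
SOURCE = """\
namespace net { class Buffer {}; }
namespace gfx { class Buffer {}; }
net::Buffer rx;
gfx::Buffer frame;
"""


def naive_rename(text: str, old: str, new: str) -> str:
    # Purely textual: matches the identifier wherever it appears,
    # with no notion of which namespace or symbol it belongs to.
    return re.sub(rf"\b{re.escape(old)}\b", new, text)


# Intent: rename only net::Buffer. Result: gfx::Buffer is renamed too.
renamed = naive_rename(SOURCE, "Buffer", "PacketBuffer")
print(renamed)
```

A semantic rename (e.g. LSP's `textDocument/rename` request) resolves the symbol first, so only references to the `net` class would be touched.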

  • I wonder, is the problem here that LSP is updating too slowly all the time? Or just that there’s a chance it will update very slowly, and you never really know if you’ll hit that chance, so your model always has to do the “long time wait” just in case? In the latter case, it seems like it ought to be possible for LSP to report that it is still processing, somehow…

    • I'm not an expert, but my reading of the spec is that LSP allows generic `$/`-prefixed notifications, but there isn't a specific standard for readiness reporting beyond `initialize` / `initialized`. The spec treats that as a single first-time handshake, so it isn't suitable for monitoring ongoing staleness or readiness after a file change is detected.

      There are notifications (e.g. `textDocument/didChange`) that you can send to the LSP to help it along, but again you might end up with a race between the notification from the client making the change and any file-watchers you might have running.

      I suspect the answer will come in the form of more powerful LSP implementations with generous in-memory caches, where disk changes are just another buffered input that can be disregarded if already stale. The disk is no longer seen as the source of truth; the LSP becomes the real source of truth, so everything can coordinate through it, operating mostly out of memory.

      Another avenue for better success will be more research into faster compilation and better incremental compilation for languages with slower compilation.

      Maybe one day we'll even get AI agents directly manipulating syntax trees, and the code to get there being written back as merely a side-effect, but that seems like sci-fi compared to the current state of play. LSP is still very document based, and of course LLMs are also trained on oodles of source.
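For reference, this is roughly what the `textDocument/didChange` notification mentioned above looks like on the wire. It's a minimal sketch assuming full-document sync and a document already opened with `textDocument/didOpen`. The helper names `frame` and `did_change` are mine, not part of any library.

```python
import json


def frame(payload: dict) -> bytes:
    # LSP base protocol: a Content-Length header, a blank line, then JSON.
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def did_change(uri: str, version: int, new_text: str) -> dict:
    # A notification has no "id", so the server sends no reply to wait on.
    # With full sync, contentChanges holds one entry with the whole text.
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            "textDocument": {"uri": uri, "version": version},
            "contentChanges": [{"text": new_text}],
        },
    }


message = frame(did_change("file:///src/widget.cpp", 2, "int x = 0;\n"))
```

Because it is a notification, nothing comes back to say "processed", which is part of why the client can't tell when the server has caught up.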

I work in Unity, and I got so frustrated with Claude constantly running gross nested bash/grep/awk/sed loops that took forever that I finally described (and had Claude implement and install) a tool that gathers all this info from a Unity forest of scenes at once. It answers all the questions Claude ever wanted to ask about a Unity project in a single pass that takes 50ms, instead of ten 30-second iterations. It still took a lot of coaching to get it to actually use this tool, but it seems like I’ve convinced it.

  • If it helps, I've found that using context (CLAUDE.md etc.) is way less effective for this type of pattern than using a PreToolUse hook to capture "bad patterns". The hook either transparently rewrites them to "do the right thing" if that is possible statically or, if not, rejects the tool use with a message that tells the agent "how" to use the intended tooling itself.
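A rough sketch of the rejecting variant of that hook idea. Everything specific here is an assumption to check against your agent's hook documentation: that the hook gets a JSON event on stdin with `tool_name` and `tool_input.command` fields, and that exit code 2 blocks the call and feeds the message back to the agent. The `unity-query` tool name and the regex are hypothetical stand-ins.

```python
import json
import re

# Hypothetical "bad pattern": hand-rolled text scraping of Unity assets.
BAD_PATTERN = re.compile(r"\b(grep|awk|sed)\b.*\.(unity|prefab)\b")

# Hypothetical replacement tool; substitute whatever you actually built.
HINT = "Blocked: use `unity-query` instead of grep/awk/sed over scene files."


def run_hook(stdin_text: str) -> tuple[int, str]:
    """Return (exit_code, message) for one pre-tool-use event."""
    event = json.loads(stdin_text)
    # Assumed event shape: only shell commands are inspected.
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    if BAD_PATTERN.search(command):
        # Assumed convention: exit code 2 rejects the tool call.
        return 2, HINT
    return 0, ""
```

A small wrapper script would call `run_hook(sys.stdin.read())`, print the message to stderr, and exit with the returned code. Keeping the decision in a pure function makes the patterns easy to unit test as they grow.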

tool_call is just a fancy wrapper around a black box that executes console commands. Said commands are now the actual backbone of all agentic AI. It feels like the Linux people are incredibly vindicated in the single responsibility principle.