Comment by lukeundtrug

15 hours ago

Happened to have written both a tool and a blog post about the topic. It’s more about the different technical approaches you have in solving the problem but it might still interest you :)

https://www.context-master.dev/blog/deterministic-semantic-c...

Let me know, what you think

This is interesting - I have been working on the same thing, building contextual data, LSP-style.

I saw the tools page where if I understand right, `get-symbol-context` is actually the main useful tool for what you provide? The others seem more metadata it's easy to get already (?) but that tool provides the extra info.

I had been working on exposing mine as more high-level, ie multiple APIs to query different kinds of metadata about symbols, types, etc. But I am still not sure of the best approach, where my thinking was about not overloading the AI with too many different tools. They accumulate quickly.

  • I definitely share the same sentiment. I don’t want to overload the llm with many tools. Better to have a few opinionated and flexible ones, but yeah, keeping the balance is hard.

    I would say the main two tools are get-symbol-context and get-repository-overview. The latter is actually the more complex and sophisticated one. I’m running some graph algorithms to rank the symbols in terms of relative importance based on centrality metrics, I.e. how well connected they are in the symbol graph.

    The idea behind that is to allow the llm to infer the general structure and architecture of the project with just one tool call.

    I guess you could reach a similar thing if you had some good Agents.md or docs detailing that for your project, but this was more meant to reach that on the fly.

    The symbol-context tool is basically a graph query tool (without a dsl or cipher support yet), but yeah here the question is also whether it makes more sense to give the ai the possibility to run cipher queries itself or abstract it away in a thinner api.

    The main underlying factor of my tool is however the graph that I’m building and the metadata which can be extracted from that (connections, type of connection, etc. ) :)

    Whats the metadata you have in mind?