Comment by faangguyindia
4 days ago
I am developing a coding agent that currently manages and indexes over 5,000 repositories. The agent's state is stored locally in a hidden `.agent` directory, which contains a configuration folder for different agent roles and their specific instructions. There is also an "agents" folder with multiple files, each of the form
<Role> <instruction>
An agent only reads a file if its own role is defined there.
Inside the project directory, we have a `.<coding agent name>` folder where the coding agent's state is stored.
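For a sense of how the role gating works, here is a minimal sketch (heavily simplified; the first-line header convention and file layout shown here are illustrative, not our exact format):

```python
from pathlib import Path

AGENT_DIR = Path(".agent")          # hidden state directory
AGENTS_DIR = AGENT_DIR / "agents"   # one instruction file per role

def load_instructions(role: str) -> list[str]:
    """Collect instruction bodies only from files that declare this role."""
    instructions = []
    for f in sorted(AGENTS_DIR.iterdir()):
        if not f.is_file():
            continue
        header, _, body = f.read_text(encoding="utf-8").partition("\n")
        # Illustrative convention: the first line names the role, the rest is the instruction.
        if header.strip().lower() == role.lower():
            instructions.append(body.strip())
    return instructions

# e.g. a "reviewer" agent only picks up files whose first line names that role
reviewer_rules = load_instructions("reviewer")
```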
Our process kicks off with an `/init` command, which triggers a deep analysis of the entire repository. Instead of just indexing the raw code, the agent generates a high-level summary of its architecture and logic. These summaries appear in the editor as toggleable "ghost comments." They're a metadata layer, not part of the source code, so they are never committed with the actual code. A sophisticated mapping system precisely links each summary annotation to the relevant lines of code.
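The mapping layer is easiest to picture as a sidecar store that lives next to the code but never inside it. A stripped-down sketch of what a single annotation record looks like (the file name and field names here are simplified, not our actual schema):

```python
import json
from pathlib import Path

SIDECAR = Path(".agent/annotations.json")   # illustrative location, never committed

def add_summary(annotations: list, file: str, start: int, end: int, text: str) -> None:
    """Record one ghost comment: a summary mapped to a line range in a file."""
    annotations.append({"file": file, "lines": [start, end], "summary": text})

annotations: list = []
add_summary(annotations, "auth/session.py", 40, 88,
            "Validates the refresh request, rotates the session token, "
            "and schedules the old token for revocation.")
SIDECAR.parent.mkdir(exist_ok=True)
SIDECAR.write_text(json.dumps(annotations, indent=2))
```

The editor overlays these records as toggleable annotations; the source files themselves are untouched.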
This architecture is the solution to a problem we faced early on: running Retrieval-Augmented Generation (RAG) directly on source code never gave us the results we needed.
Our current system uses a hybrid search model. We use the AST for fast, literal lexical searches, while RAG is reserved for performing semantic searches on our high-level summaries. This makes all the difference. If you ask, "How does authentication work in this app?", a purely lexical search might only find functions containing the word `login` and functions/classes appearing in its call hierarchy. Our semantic search, however, queries the narrative-like summaries. It understands the entire authentication flow like it's reading a story, piecing together the plot points from different files to give you a complete picture.
It works like magic.
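To make the split concrete, here is a toy version of the two passes (nowhere near the real system, but it shows the division of labour between the AST side and the summary side):

```python
import ast
import math

def lexical_hits(source: str, term: str) -> list[str]:
    """AST pass: names of functions/classes whose identifier contains the term."""
    tree = ast.parse(source)
    return [node.name for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
            and term.lower() in node.name.lower()]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def semantic_hits(query_vec: list[float], summaries: list[tuple[str, list[float]]],
                  top_k: int = 5) -> list[str]:
    """RAG pass: rank the narrative summaries by embedding similarity."""
    ranked = sorted(summaries, key=lambda s: cosine(query_vec, s[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

The question "How does authentication work in this app?" gets embedded and matched against the summary embeddings; the lexical pass is what still catches the literal `login` call sites.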
Working on something similar. Legacy codebase understanding requires this type of annotation, and "just use code comments" is too blunt an instrument to do much good. Are you storing the annotations completely out of band with respect to the files, or using filesystem metadata capabilities?
This type of metadata could have value in its own right; there are many kinds of documents that will be analyzed by LLMs and will need not only a place to store analysis alongside document parts, but also meta-metadata about the analysis (timestamps, models, prompts used, etc.). Of course this could all be done out of band, but then you need a robust way to link your metadata store to a file that has a lifecycle all its own, one that's (probably) only observable by you.
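One way that link could be made robust is to key each analysis record on the content that was analyzed (e.g. the git blob hash) rather than the path, and carry the meta-metadata alongside it. A rough sketch, with everything here being illustrative:

```python
import datetime
import hashlib

def git_blob_sha(data: bytes) -> str:
    """Same hash git uses for file contents, so it matches `git hash-object`."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()

def analysis_record(path: str, content: bytes, summary: str,
                    model: str, prompt_id: str) -> dict:
    return {
        "path": path,                      # where the file lived when analyzed
        "blob": git_blob_sha(content),     # exactly which version was analyzed
        "summary": summary,
        "model": model,                    # meta-metadata: which model...
        "prompt_id": prompt_id,            # ...and which prompt produced it
        "created_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
```

If the blob hash no longer matches the file on disk, the record is stale and can be flagged for re-analysis rather than silently drifting.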
To add further to your idea:
You can create a hierarchy of summaries. The idea is that summaries can exist at the method level, the class level, and the microservice or module level. Each layer of summaries points to its child layers, and the leaf nodes are the code itself. It could be a B-tree or a plain tree.
The RAG agent can traverse as deep as needed for the particular semantic query. Each level maintains semantic understanding of the layer beneath it but as a tradeoff it loses a lot of information and keeps only what is necessary.
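In code, the structure I have in mind is something like this (a toy sketch; the relevance scorer is just a placeholder for whatever the RAG agent actually uses to decide how deep to go):

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class SummaryNode:
    level: str                          # "module" | "class" | "method" | "code"
    summary: str                        # lossy description of everything below
    children: list["SummaryNode"] = field(default_factory=list)
    code: Optional[str] = None          # only leaf nodes carry actual source

def collect_context(node: SummaryNode, relevance: Callable[[str], float],
                    threshold: float = 0.5, depth: int = 0, max_depth: int = 3) -> list[str]:
    """Keep the higher-level summary when it's good enough; descend only into
    subtrees the query scores as relevant, trading detail for a smaller context."""
    if relevance(node.summary) < threshold or depth == max_depth or not node.children:
        return [node.code or node.summary]
    context: list[str] = []
    for child in node.children:
        context.extend(collect_context(child, relevance, threshold, depth + 1, max_depth))
    return context
```

Good abstractions let the traversal stop high up; leaky ones force it all the way down to the code leaves.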
This will work if the abstractions in the codebase are done nicely - abstractions are only useful if they actually hide implementation details. If your abstractions are good enough, you can afford to keep only the higher layers (as required) in your model context. But if they're not, you might even have to put actual code into the context.
For instance, a method like `add(n1, n2)` is a strong abstraction - I don't need to know its implementation, only its semantic meaning at this level.
But in real life, methods don't always do just one thing - there's logging, global caches, etc.
Tell me more!
The agent I’m developing is designed to improve or expand upon "old codebases." While many agents can generate code from scratch, the real challenge lies in enhancing legacy code without breaking anything.
This is the problem I’m tackling, and so far, the approach has been effective. It's simple enough for anyone to use: the agent makes a commit, and you can either undo, squash, or amend it through the Git UI.
The issue is that developers often skip reviewing the code properly. To counter that, I’m considering shifting to a hunk-by-hunk review process, where each change is reviewed individually. Once the review is complete, the agent would commit the code.
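Roughly, that gate could look like this (a simplified sketch built on plain `git diff` and `git apply --cached`, not the actual implementation; error handling, binary files, and edge cases are ignored):

```python
import subprocess

def run(*args: str, **kw) -> str:
    """Run a command and return its stdout."""
    return subprocess.run(args, check=True, capture_output=True, text=True, **kw).stdout

def split_hunks(diff: str):
    """Yield (file_header, hunk) pairs from unified `git diff` output."""
    header, hunk = [], []
    for line in diff.splitlines(keepends=True):
        if line.startswith("diff --git"):
            if hunk:
                yield "".join(header), "".join(hunk)
            header, hunk = [line], []
        elif line.startswith("@@"):
            if hunk:
                yield "".join(header), "".join(hunk)
            hunk = [line]
        elif hunk:
            hunk.append(line)
        else:
            header.append(line)
    if hunk:
        yield "".join(header), "".join(hunk)

def review_and_commit(message: str) -> None:
    staged = False
    for header, hunk in split_hunks(run("git", "diff")):
        print(header + hunk)
        if input("Stage this hunk? [y/N] ").strip().lower() == "y":
            # Stage only the approved hunk; the working tree is left untouched.
            run("git", "apply", "--cached", input=header + hunk)
            staged = True
    if staged:
        run("git", "commit", "-m", message)
```

Rejected hunks simply stay unstaged in the working tree, so the developer can fix or discard them before the next round.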
The concept is simple, but the fun lies in the implementation details, like integrating existing CLI tools without giving the agent full shell access the way other agents do.
What excites me most is the idea of letting 2–3 agents compete, collaborate, and interpret outputs to solve problems and "fight" to find the best solution.
That’s where the real fun is.
Think of it as a surgeon's scalpel approach rather than the "steamroller" approach most agents take.