Comment by asadm
4 days ago
These days, I usually paste my entire repo (or some of it) into Gemini and then APPLY the changes back into my code using this handy script I wrote: https://github.com/asadm/vibemode
I have tried aider/copilot/continue/etc., but they each fall short in one way or another.
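Not the linked script, but the apply-back step is simple enough to sketch. A minimal Python version, assuming you ask the model to reply with <file path="...">full contents</file> blocks (the tag format here is my assumption, not necessarily what vibemode actually uses):

    # apply_reply.py - write <file path="...">...</file> blocks from a model
    # reply back to disk; the tag format is illustrative, not vibemode's.
    import pathlib
    import re
    import sys

    reply = sys.stdin.read()
    for m in re.finditer(r'<file path="([^"]+)">\n?(.*?)</file>', reply, re.DOTALL):
        path, contents = m.group(1), m.group(2)
        out = pathlib.Path(path)
        out.parent.mkdir(parents=True, exist_ok=True)  # create missing dirs
        out.write_text(contents)
        print(f"wrote {path}")

Something like pbpaste | python apply_reply.py (macOS) would then apply whatever reply is on the clipboard.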
It’s not just about saving money or making fewer mistakes; it’s also about iteration speed. I can’t believe this process is remotely comparable to aider.
In aider everything is loaded in memory: I can add/drop files in the terminal, discuss in the terminal, and switch models; every change is a commit; I can run terminal commands with ! at the start.
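For anyone who hasn't tried it, a typical session with those features looks roughly like this (the slash commands are aider's in-chat command set; the model names are just examples):

    $ aider --model gemini/gemini-2.5-pro
    > /add src/main.py src/utils.py     # load full files into the chat context
    > refactor the retry logic into its own function
    > /drop src/utils.py                # remove a file from context
    > /model gemini/gemini-2.0-flash    # switch models mid-session
    > !pytest -x                        # run a shell command without leaving aider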
A full codebase is more expensive and slower than just the relevant files. I understand it works when you don’t worry about the cost, but at any reasonable size, pasting the full codebase can’t really be a thing.
I am on my 5th project with this workflow, and they are of different types too:
- an embedded project for ESP32 (100k tokens)
- visual inertial odometry algorithm (200k+ tokens)
- a web app (60k tokens)
- the tool itself mentioned above (~30k tokens)
It has worked well enough for me. Other methods have not.
Use a tool like repomix (npm), which has extensions in some editors (at least VSCode) that can quickly bundle source files into a machine-readable format.
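For example (flag names as of recent repomix versions; double-check with repomix --help):

    npx repomix                                  # pack the repo into repomix-output.xml
    npx repomix --style markdown                 # emit markdown instead of XML
    npx repomix --include "src/**" --ignore "**/*.test.ts"   # limit what gets bundled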
Why not just select Gemini 2.5 Pro in Copilot with Edit mode? Virtually unlimited use without extra fees.
Copilot used to be useless, but over the last few months, since Edit mode was added, it has become quite excellent.
Copilot (and others) try to be too smart and do context reduction (to save their own wallets). I want the ENTIRETY of the files I attach in the context, not a RAG-ed version of them.
This problem is real.
Claude Projects, ChatGPT Projects, Sourcegraph Cody's context building, MCP file systems: all of these are black boxes of what I can only describe as lossy compression of context.
Each is incentivized to deliver ~”pretty good” results at the highest token compression possible.
The best way around this I’ve found is to just use the web clients directly and own the context myself, by including structured, concatenated files directly in the chat.
Self-plug but super relevant: I built FileKitty specifically to aid this; it made the HN front page and I’ve continued to improve it:
https://news.ycombinator.com/item?id=40226976
If you can quickly prepare your file-system context yourself with any workflow, and pair it with appropriate additional context (run output, problem description, etc.), you can get excellent results, and you can pound away at an OpenAI or Anthropic subscription while refining the prompt or updating the file context.
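The concatenation step itself is simple enough to sketch in a few lines of Python (the general approach, not FileKitty itself; the path-header format is arbitrary):

    # bundle.py - concatenate files under path headers, ready to paste into a chat
    import pathlib
    import sys

    def bundle(paths):
        parts = []
        for p in map(pathlib.Path, paths):
            parts.append(f"## {p}\n```\n{p.read_text()}\n```")
        return "\n\n".join(parts)

    if __name__ == "__main__":
        print(bundle(sys.argv[1:]))  # e.g. python bundle.py src/*.py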
I have found myself spending more time putting together complex prompts for big, difficult problems that would not make sense to solve in the IDE.
I believe this is the root of the problem for all agentic coding solutions. They are gimping the full context through fancy function calling and tool use to reduce what is sent through the API. The problem with this is that you can never know what context is actually needed for the problem to be solved in the best way. The funny thing is, this type of behavior actually leads many people to believe these models are LESS capable than they actually are, because people don't realize how restricted the models are behind the scenes by the developers. The good news is that we are entering the era of large context windows, and we will all see a huge performance increase in coding as a result of these advancements.
Regarding context reduction: this got me wondering. If I use my own API key, there is no way for the IDE or Copilot provider to benefit other than through the monthly sub. But if I am using their provided model with tokens from the monthly subscription, they are incentivized to charge me based on the tokens I submit to them, then optimize that, pass a smaller request on to the LLM, and keep the extra margin. Is that what you are referring to?
FWIW, Edit mode gives the impression of doing this, vs. originally only passing the context visible in the open window.
You can choose files to include, and they don't appear to be truncated in any way. To be fair, I haven't checked the network traffic, but it appears to operate this way in day-to-day use.
Thanks, most people don't understand this fine difference. Copilot does RAG (as do all other subscription-based agents like Cursor) to save $$$, and results with RAG are significantly worse than having the complete context window for complex tasks. That's also the reason ChatGPT and Claude basically lie to users when they market their file-upload features without telling the whole story.
Is that why it’s so bad? I’ve been blown away by how bad it is. Never had a single successful edit.
The code completion is chef's kiss though.
Cline doesn’t do this; that’s what makes it suitable for working with Gemini and its large context.
Isn't this similar to https://aider.chat/docs/usage/copypaste.html ?
Just checked to see how it works. It seems it does everything you are describing. The difference is in the way it provides the files: it doesn't use an XML format.
If you wish, you could /add * to add all your files.
Also, deducing from this mode, it seems that any file you add to the aider chat with /add has its full contents added to the chat context.
But hey, I might be wrong; I only did a limited test with 3 files in a project.
That’s correct: aider doesn’t RAG on files, which is good. I don’t use it because 1) the UI is slow and clunky, and 2) using Gemini 2.5 via the API in this way (huge context window) is expensive and also heavily rate-limited at this point. No such issue when it's used via the AI Studio UI.
You could use aider's copy-paste mode with the AI Studio UI or any other web chat, and use gemini-2.0-flash as the aider model that applies the changes. But I understand your first point.
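If I'm reading the linked docs right, that setup is roughly this (the --copy-paste flag is from aider's copy-paste docs; the model name is just an example):

    # aider copies the formatted context for you to paste into a web chat,
    # then applies the reply you paste back using the configured model
    aider --copy-paste --model gemini/gemini-2.0-flash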
I also understand having built your own tool to fit your own workflow, and being able to easily mold it to what you need.
I felt it loses track of things on really large codebases. I use 16x Prompt to choose the appropriate files for my question and let it generate the prompt.
Do you mean Gemini? I generally notice pretty great recall up to 200k tokens. It's ~OK after that.