Comment by IncreasePosts

4 hours ago

I do this with my agents. Basically, every "work" oriented call spawns a subprocess which does not add anything to the parent context window. When the subprocess completes the task, I ask it to 1) provide a complete answer, 2) provide a succinct explanation of how the answer was arrived at, 3) provide a succinct explanation of any attempts which did not work, and 4) Anything learned during the process which may be useful in the future. Then, I feed those 4 answers back to the parent as if they were magically arrived at. Another thing I do for managing context window is, any tool/MCP call has its output piped into a file. The LLM then can only read parts of the file and only add that to its context if it is sufficient. For example, execute some command that produces a lot of output and ultimately ends in "Success!", the LLM can just tail the last line to see if it succeeded. If it did, the rest of the output doesn't need to be read. if it fails, usually the failure message is at the end of the log. Something I'm working on now is having a smaller local model summarize the log output and feed that summarization to the more powerful LLM (because I can run my local model for ~free, but it is no where near as capable as the cloud models). I don't keep up with SOTA so I have no idea if what I'm doing is well known or not, but it works for me and my set up.

0 comments

IncreasePosts

No comments yet

Contribute on Hacker News ↗