Comment by momo26

21 days ago

I was wondering that will modifying prompts or contracting the context also impact the performance? It may mistake the original meaning, and these steps also need help from external LLM.

Forge doesn't modify the prompt, it just injects information into the conversation as if it was a conversation turn. Over many turns - it can degrade the model (a concept I'm calling "effective attention"). But that requires serious context growth that really only becomes relevant for long-running agentic coding tasks in my experience. Still, it's possible.

Context compaction can also affect the outcome - I have eval scenarios for that as well but not in the published set, only in the repo. For those, I'd say "it's better than nothing". If you hit max context, the whole thing will barf or OOM the rig or something like that. So compaction degrades performance versus some theoretical ideal where you never need to, certainly. But it's better than a hard failure. Eval on those scenarios showed increasing degradation depending on severity of compaction. I view the auto-compaction as insurance. I never give the models tasks that will require that much context, but if it ends up getting there then the run might be saved.