Comment by MrBlaze

18 days ago

Is there any way to add additional mechanics into your guardrail system? What I'm thinking of is two things.

1. Pair every learned experience with a real memory system. If it took a small model 20 attempts to get a framework done right, all the results, WITH the solution, get stored. Every later session then never starts over: it gets it right immediately from the stored success records, which in the long term makes small models just as fast for repetitive tasks. Isn't repetitive work exactly what we're all trying to replace?

2. Related to 1, but needed: is there any way to add a memory plugin module that can use API, CLI, or, as a last resort, MCP endpoints? That way every small model has the framework built in and can be connected to any current or future memory system. The model is forced to read and write all of its failures and successes correctly, so its improvements and speed scale over time, because it always has to update and call from a proper memory system and task-success tracking system.

3. Critical addition: my idea would be for this harness to leverage small local models to 100% take over and REPLACE repetitive large-model tasks. This is how:

- Any task the small model cannot do, it records to the brain/memory. Then, as a second step, it spins up a large LLM to accomplish the exact same task, BUT that large LLM is forced to write, record, and correct in memory ALL the steps the small model got wrong on that task.

- Now the small model takes over and directly references that exact architectural method, planned and proven by the large language model. It should get 100% success and repeatable results, permanently replacing the need for a large language model on that task.
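To make the loop concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: `run_with_escalation`, the `blueprints` and `failures` dicts, and the two toy models are not part of Forge or of any real memory system, and a "model" is reduced to a plain function that returns whether it succeeded and which steps it took.

```python
from typing import Callable, Optional

# A model is stood in for by a plain function:
# (task, blueprint or None) -> (succeeded, steps taken).
Model = Callable[[str, Optional[list[str]]], tuple[bool, list[str]]]


def run_with_escalation(
    task: str,
    small: Model,
    large: Model,
    blueprints: dict[str, list[str]],
    failures: dict[str, list[list[str]]],
) -> tuple[str, bool, list[str]]:
    """Try the small model first; call the large model only if it fails."""
    ok, steps = small(task, blueprints.get(task))
    if ok:
        blueprints.setdefault(task, steps)  # keep the first proven recipe
        return "small", True, steps

    # Record what the small model tried so the failure is not lost.
    failures.setdefault(task, []).append(steps)

    ok, steps = large(task, None)
    if ok:
        # The large model's proven steps become the blueprint for next time.
        blueprints[task] = steps
    return "large", ok, steps


# Toy stand-ins: the small model only succeeds when handed a blueprint.
def toy_small(task: str, blueprint: Optional[list[str]]) -> tuple[bool, list[str]]:
    if blueprint is None:
        return False, ["guess"]
    return True, list(blueprint)


def toy_large(task: str, blueprint: Optional[list[str]]) -> tuple[bool, list[str]]:
    return True, ["plan", "execute", "verify"]
```

With these toy models, the first call for a task is answered by the large model and the second by the small one following the stored blueprint. Whether a real small model can follow a blueprint that reliably is exactly the open question.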

What I'm asking is whether I/you/we could work on expanding this so that large language models only step in to fix and BLUEPRINT the tasks, which the small models then take over going forward.

This would radically change which small models are needed, since everything is just following the exact breadcrumbs laid out once by a stronger model. After that, a strong model is never needed for that task again.

Any business would very soon have its memory system full of all the tasks that business needs and could then be run entirely on local models.

They would only need to pay once, for the initial TRAINING period with large models to fill the brain; then the small model takes over.

Interested in your thoughts on my ideas

If you're interested in how this could be implemented and connected to memory systems, I have one of the highest-benchmarked memory systems, a hybrid, which should be easy to connect to your harness.

Would be super happy to work with you on this if it's of interest.


zambelli 17 days ago

Thanks for the thoughtful comment! Let me try to unpack some of what's there and what's missing.

Forge is at its core a mechanical reliability layer, whereas a lot of memory/skill management would be more of an orchestration component that the consumer would own.

That split, with Forge stopping at the mechanical layer, was an intentional design decision, but there's no reason it couldn't grow into more. I think a lot of what you're thinking about is a big model/small model split similar to how CC does it, but that's an orchestrator.

Now, where Forge can help with what you're suggesting: I think most of it is there, but it needs some wiring from the consumer/orchestrator:

- Forge surfaces information about which guardrails fired: InferenceResult.new_messages carries typed MessageMeta.type values (RETRY_NUDGE, STEP_NUDGE, PREREQUISITE_NUDGE, CONTEXT_WARNING, SUMMARY), so every nudge that fired during a run is observable per step. A consumer could capture that and compare it to the workflow steps to reconstruct what success looked like.

- Combined with Guardrails.check() > CheckResult, you would have a lot of the journey the model took to get to the answer.

- Forge lets you (actually, requires you to) define the system prompt, any workflow restrictions, and the tools. So if you know something about how your task will behave with a small model, you can include that in the system prompt, or in a tool that's a required step, etc.
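Roughly, the capture side in a consumer could look like the sketch below. The classes are stand-ins defined inline so the snippet runs by itself; they only mirror the names mentioned above (InferenceResult.new_messages, MessageMeta.type, and the five type values). How a message exposes its meta (`message.meta` here), the `Message` class, and the idea of one InferenceResult per step are assumptions, so check the real types before reusing any of this.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    """Stand-in for the MessageMeta.type values listed above."""
    RETRY_NUDGE = "retry_nudge"
    STEP_NUDGE = "step_nudge"
    PREREQUISITE_NUDGE = "prerequisite_nudge"
    CONTEXT_WARNING = "context_warning"
    SUMMARY = "summary"


@dataclass
class MessageMeta:  # stand-in, not the library's class
    type: MessageType


@dataclass
class Message:  # hypothetical wrapper; the `meta` attribute is a guess
    meta: MessageMeta


@dataclass
class InferenceResult:  # stand-in, assumed to be one result per step
    new_messages: list[Message]


def nudges_per_step(results: list[InferenceResult]) -> list[list[str]]:
    """List the names of the guardrail messages seen at each step, in order."""
    return [[m.meta.type.name for m in r.new_messages] for r in results]


def nudge_totals(results: list[InferenceResult]) -> dict[str, int]:
    """Count how often each message type fired across the whole run."""
    counts: Counter = Counter()
    for step in nudges_per_step(results):
        counts.update(step)
    return dict(counts)
```

A run with few or no nudges is a reasonable candidate for "what success looked like", while steps that keep collecting RETRY_NUDGE entries point at where a small model struggles.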

Integrations with MCPs etc. that house memories and skills can be surfaced to the model with Forge in place: prompt the model to search for tools in the MCP, surface an MCP tool, etc. I've built a consumer that follows this pattern: main agent gets task > main agent eyeballs whether it can be solved on its own > if not, sends it to a subagent specialized in that topic (with access to more tools related to it), which allows me to keep context lean for each agent.

You could do something similar where the model is prompted to use its toolset but, if it's unsure or needs a tool it doesn't have, to call the get_mcp() tool or something to look for better options.
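As a toy illustration of that fallback, assuming a hypothetical `get_mcp` tool: here a plain dict stands in for whatever an MCP server would expose, and `make_get_mcp`, `catalog`, and `toolset` are invented names. A real version would query an MCP server instead of a dict.

```python
from typing import Callable

# A tool is reduced to a function from one string argument to a string result.
Tool = Callable[[str], str]


def make_get_mcp(catalog: dict[str, Tool], toolset: dict[str, Tool]) -> Tool:
    """Build a get_mcp tool that searches `catalog` and extends `toolset`."""

    def get_mcp(query: str) -> str:
        # Case-insensitive substring match on tool names, in a stable order.
        matches = sorted(name for name in catalog if query.lower() in name.lower())
        for name in matches:
            toolset[name] = catalog[name]  # make the tool callable from now on
        return ", ".join(matches) if matches else "no matching tools"

    return get_mcp


# Hypothetical external catalog and a lean starting toolset.
catalog: dict[str, Tool] = {
    "search_memory": lambda q: f"memory hit for {q}",
    "send_email": lambda q: "sent",
}
toolset: dict[str, Tool] = {}
toolset["get_mcp"] = make_get_mcp(catalog, toolset)
```

The model starts with only `get_mcp` in its toolset; calling it with "memory" adds `search_memory`, which keeps context lean until an extra tool is actually needed.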

Big model vs. small model now. A couple of ways I think about it:

- You could use big models to go through your workflow a few times, see common patterns, and then use those to define prerequisite and required steps in Forge guardrails when using small models.

- You could use small models the same way Claude Code uses the ANTHROPIC_SMALL_FAST_MODEL env var (this is what the Explore subagent uses, I think). The big model is effectively an orchestrator, and when it recognizes a task is easy, it dispatches a small model to do it, where Forge might make it viable.
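For the first option, the "see common patterns" part could start from something as simple as the sketch below. It assumes each big-model run has been logged as an ordered list of tool names, which is my assumption about the consumer's logging. The function names are hypothetical, and turning the output into actual guardrail definitions is left out because that depends on the Forge configuration you use.

```python
from itertools import permutations


def required_steps(runs: list[list[str]]) -> set[str]:
    """Steps that appear in every successful run."""
    if not runs:
        return set()
    return set.intersection(*(set(run) for run in runs))


def prerequisite_pairs(runs: list[list[str]]) -> set[tuple[str, str]]:
    """Pairs (a, b) where the first use of a precedes the first use of b in every run."""
    common = required_steps(runs)
    pairs = set()
    for a, b in permutations(sorted(common), 2):
        # list.index returns the first occurrence, so repeated steps are fine.
        if all(run.index(a) < run.index(b) for run in runs):
            pairs.add((a, b))
    return pairs


# Hypothetical traces from three big-model runs of the same workflow.
runs = [
    ["read_spec", "write_code", "run_tests"],
    ["read_spec", "search_docs", "write_code", "run_tests"],
    ["read_spec", "write_code", "lint", "run_tests"],
]
```

Here `required_steps(runs)` keeps `read_spec`, `write_code`, and `run_tests`, and `prerequisite_pairs(runs)` finds that they always occur in that order. A handful of runs is a small sample, so treat the result as a starting point to review by hand.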

Hopefully that helps! Forge could certainly elevate some of this to be more native, and I might do that, like a mode that packages up results for you so you don't need to reconstruct the nudge events from hooks firing. But everything should be there to integrate with a memory system with the information required, or with an API/MCP that has more tools or skills for the agent to read.

Would love to see the integration if you do it! You'd just need a consumer that captures the events Forge returns and packages them up into whatever your memory system is looking for!

If you're looking for other ways of ingesting those memories/skills that aren't system prompt, message, or tool result, then that's something I can look into.
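The packaging half of such a consumer might look something like this. It is only a sketch: `RunRecord`, `MemoryBackend`, `InMemoryBackend`, `store_run`, and `successful_runs` are all hypothetical names, the record fields are guesses at what a memory system might want, and the dict-backed store stands in for a real API, CLI, or MCP call.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Protocol


@dataclass
class RunRecord:
    """One finished run, reduced to what a memory system might store."""
    task: str
    succeeded: bool
    nudges: list[str] = field(default_factory=list)  # e.g. ["RETRY_NUDGE"]
    steps: list[str] = field(default_factory=list)   # tool calls, in order


class MemoryBackend(Protocol):
    """Minimal read/write surface a memory system would need to offer."""
    def write(self, key: str, payload: str) -> None: ...
    def read(self, key: str) -> list[str]: ...


class InMemoryBackend:
    """Toy backend; a real one would call out to the memory system."""
    def __init__(self) -> None:
        self._data: dict[str, list[str]] = {}

    def write(self, key: str, payload: str) -> None:
        self._data.setdefault(key, []).append(payload)

    def read(self, key: str) -> list[str]:
        return list(self._data.get(key, []))


def store_run(memory: MemoryBackend, record: RunRecord) -> None:
    """Serialize the record to JSON and file it under its task name."""
    memory.write(record.task, json.dumps(asdict(record), sort_keys=True))


def successful_runs(memory: MemoryBackend, task: str) -> list[RunRecord]:
    """Load every stored run for a task and keep only the successes."""
    records = [RunRecord(**json.loads(p)) for p in memory.read(task)]
    return [r for r in records if r.succeeded]
```

Keeping the backend behind a two-method interface means the same consumer code could be pointed at a different memory system by swapping the backend class.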