Comment by trollbridge

7 hours ago

oh-my-pi can delegate tasks to other models too. I usually use DS4 Flash for low priority subagent tasks.

If Fable is "delegating" tasks, then there's actually an agent front end of whatever you think the API is.

We have a local instance of Qwen-3.6 which is more than adequate for running agents. You can mix and match local and cloud-hosted models. (My biggest use case for local models right now is vision models because they're quite small and I can avoid some data-locality issues my customers wouldn't be comfortable with if I sent them to a Chinese model.)

> If Fable is "delegating" tasks, then there's actually an agent front end of whatever you think the API is.

I would say behind (I believe you use the API just like you do Opus), but yeah. I'm not claiming it's a property of the LLM itself; I also presume this is some variety of tool-calling agent harness.

> We have a local instance of Qwen-3.6 which is more than adequate for running agents. You can mix and match local and cloud-hosted models.

I'm presuming OP meant local as in the models run locally as well. I do know you can do subagents in Pi (probably others too), but the vast majority of people are going to hit hardware limitations trying to run them in parallel on local hardware.

I'm doubtful Fable's harness is unique in some way that you can't replicate with Pi. I'm mostly doubtful there are more than a handful of people with hardware sitting in their house that can execute more than one meaningfully smart model at a time.

If you're on local hardware, Deepseek v4 Flash is in the ballpark of 180GB of VRAM alone. Even on smaller models, Qwen + a dumber agent to execute is probably in the realm of 60GB of VRAM.
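To put rough numbers on that, here's a minimal sketch of the "does it all fit at once" check. The footprints are just my ballpark figures from above, not measurements, and the model keys, the `fits_in_parallel` helper, and the 24GB and 96GB budgets are all made up for illustration.

```python
# Ballpark VRAM footprints in GB, taken from the estimates above.
# These are rough guesses, not measured values; the keys are illustrative labels.
FOOTPRINT_GB = {
    "deepseek-v4-flash": 180,
    "qwen-plus-executor": 60,  # Qwen plus a smaller agent to execute
}


def fits_in_parallel(models, budget_gb):
    """Return True if every listed model can be resident in VRAM at the same time."""
    needed_gb = sum(FOOTPRINT_GB[name] for name in models)
    return needed_gb <= budget_gb


# Hypothetical budgets: one 24GB consumer card and a 96GB workstation.
for budget_gb in (24, 96):
    for combo in (["qwen-plus-executor"], ["deepseek-v4-flash", "qwen-plus-executor"]):
        print(budget_gb, combo, fits_in_parallel(combo, budget_gb))
```

With those assumed figures, only the 96GB box holds the Qwen pair, and nothing here holds Deepseek alongside it, which is the limitation I mean.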

I do suspect you could get Deepseek to do Fable-level things with a good harness (or a bunch of models, really; I'm fairly convinced the magic of Fable is in the harness rather than the model).