Comment by vasco

10 hours ago

But that difference atm is the difference between it being OK on its own with a team of subagents given good enough feedback / review mechanisms and having to babysit it prompt by prompt.

By the time gemma6 lets you do the above, the proprietary models will supposedly already be on the next step change. It just depends on whether you need to ride the bleeding edge. But especially because it's "intelligence", there's an obvious advantage in using the best version, and it's easy to hype it up and generate FOMO.

oblio 10 hours ago

> But that difference atm is the difference between it being OK on its own with a team of subagents given good enough feedback

Do people actually build meaningful things like that?

It's basically impossible to leave any AI agent unsupervised, even with an amazing harness (which is incredibly hard to build). The code slowly rots and drifts over time if not fully reviewed and refactored constantly.

Even if teams of agents working almost fully autonomously were reliable from a functional perspective (i.e. they would build a working product), the end product would accumulate ever-increasing structural chaos over time.

I'd be happy to be proven wrong.