Comment by augusteo
11 hours ago
This maps to what we've seen building AI at work.
When we started building a voice agent for inbound calls, the models were close but not quite there. We spent months compensating for gaps: latency, barge-in handling, understanding messy phone audio. A lot of that was engineering around model limitations.
Then the models got better. Fast. Latency dropped. Understanding improved. Suddenly our human-in-the-loop work wasn't compensating; it was enhancing.
The shift was noticeable. We went from "how do we work around this limitation" to "how do we build the best experience on top of this capability." That's MMF in practice.
The timing question is real though. We started building before MMF fully existed for our use case. Some of that early work was throwaway. Some of it became the foundation. Hard to know in advance which is which.
The danger is that we bridge that gap with backend complexity. I spent weeks over-engineering a chain of evaluators and retries to get reliable outputs from cheaper models, thinking I was optimizing margins.
Then a smarter model dropped that handled the nuance zero-shot. That sophisticated orchestration layer immediately became technical debt—slower and harder to maintain than just swapping the API endpoint.
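To make the comparison concrete, here's a minimal sketch of the pattern I mean. All the function names (`cheap_model`, `evaluator`, `smart_model`) are stand-ins for illustration, not real APIs:

```python
# Hypothetical sketch: evaluator-and-retry orchestration around a cheaper
# model, versus a single call to a stronger one. Stand-in functions only.

def cheap_model(prompt: str, attempt: int) -> str:
    # Stand-in: pretend the cheap model only gets it right on a later try.
    return "good answer" if attempt >= 2 else "vague answer"

def evaluator(output: str) -> bool:
    # Stand-in quality gate (in practice, another LLM call or heuristics).
    return "good" in output

def orchestrated(prompt: str, max_retries: int = 3) -> str:
    # The chain I built: extra latency, extra code, on every single call.
    for attempt in range(max_retries):
        output = cheap_model(prompt, attempt)
        if evaluator(output):
            return output
    raise RuntimeError("all retries failed")

def smart_model(prompt: str) -> str:
    # Stand-in for the newer model that handles the nuance zero-shot.
    return "good answer"

def direct(prompt: str) -> str:
    # Once the stronger model exists, the whole loop collapses to this.
    return smart_model(prompt)
```

The point is structural: the retry loop and the evaluator are load-bearing only as long as the model underneath is weak, and they become pure overhead the moment it isn't.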
Whatever we do now to "steer" the model to do the job, my five cents: it will all get sucked into the model itself. Skate to where the puck is going, as they say, and relentlessly focus on user experience and the overall product. That's how you get something like Granola.