Comment by danpalmer

12 hours ago

Yes. The models are good, the models are fast, and the internal tooling has caught up at this point too. There's still a lot of UI/UX/tooling work being sorted out, along with VCS integrations and deeper problems I probably can't talk about, but I'd say most people's frustrations are about the rate of change much more than about the models' actual abilities.

One thing that's interesting is a bunch of internal thought leaders who swear by the Flash models over the Pro models. Whether this is true or not doesn't really matter; the interesting bit to me is that we are at a point where "better" models are not necessarily more useful, and that faster models with more work on the harnesses may be a better trade-off.

> a bunch of internal thought leaders who swear by the Flash models over the Pro models

I'm coming around on this too. deepseek-v4-flash is impressive.

> One thing that's interesting is a bunch of internal thought leaders who swear by the Flash models over the Pro models.

I've seen people outside Google favoring the Gemini Flash models over the Pro models.

There are also some benchmarks where the Flash models have higher scores, so yes, apparently speed does matter.

You’re absolutely kidding yourself if you genuinely believe that.

  • Happy to chat internally if you want, feel free to reach out.

    I see a lot of people swearing by one model without trying others. I see a lot of opinions based on a snapshot of the tooling from ~January, when, for example, Claude Code was exceptional, opinions that don't appear to have been updated since. In blind tests the models appear to be much closer than some folks would have you believe.

    • I’ll admit it swings back and forth on a six-month cycle or so; however, cost-to-output matters.

      Also, for niche use-cases there are clear winners.