Comment by stijntonk

18 hours ago

I wish that Google would focus on bringing their Gemini 3.x models to GA, and provide enough capacity such that one not constantly has to fight with 429 errors.

It often feels like they do not want me to develop applications for corporate clients using their Vertex API. It is just such a shame, given that their models were so great for document analysis etc.

Are you doing it on a free plan? I noticed they serve way more 429s on the free plan.

  • No, for clients we use paid Vertex AI accounts. We often need to host workloads in an EU region, which rules out “global” models (and probably better capacity).

    In the past, we used a wrapper that round-robined across multiple projects to get enough quota. Luckily, many of our workloads are workflow-style tasks, so we can simply keep retrying on 429s.

    Fun fact: for one of their services, I think it was Stitch, I noticed that my paid key kept hitting quota, while the free worked fine. That blew my mind.

    • I've been seeing the same in my product; 429s in vertex.

      We generally avoid any Google AI for the most part because it's so unreliable.