Comment by 1dom

21 days ago

Hi! Thanks for the response. Like I mentioned, I only skimmed, and it sounds like there's more to it than I understand, so I'll take a deeper look and see how it feels in practice.

> Where timing gets interesting is that forge will slow down workflows because the retries mean you don't error right away. Bare runs were failing fast in my experience. But on a per-call basis there's very little overhead.

> I haven't detailed it simply because the order of magnitude of a single LLM call is so much higher than all the overhead put together.

Yeah, that makes sense and seems fair. Those sorts of delays are almost an inevitability: you're not trying to improve speed, but improving reliability can obviously increase overall throughput.

Having watched the demo video now too, automating retries etc. would be helpful for me. It's impressive to see how quickly the models run on better hardware, and the performance improvements are notable too, even if the overall run sometimes takes longer because it gets more things right. Thanks again!