Comment by Flux159

8 months ago

The article mentions that they use a single chat thread but randomly choose between two different models (with the best results from Gemini 2.5 / Sonnet 4.0 right now).

Are there any library helpers for managing this with tool-call support, or is it just closed source / dependent on someone else making it open source in a different library?

You can achieve this with LMStudio's UI to test it today. You can switch between different local models in the same chat context. You can also edit previous chat results to remove context-poisoning information.

LMStudio has an API, so it should be possible to hook into that with relatively little code.

It should be pretty simple to do, right? It shouldn't be that hard to abstract out tool calls.
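As a sketch of what that could look like: LMStudio's local server speaks an OpenAI-compatible API (typically on localhost:1234), so alternating models over one shared history is a small amount of code. The endpoint path and model identifiers below are assumptions; substitute whatever you have loaded.

```typescript
// Minimal sketch: one shared message history, a different model per request.
// Assumes LMStudio's OpenAI-compatible server at localhost:1234; the model
// names are hypothetical placeholders.

type Message = { role: "system" | "user" | "assistant"; content: string };

const MODELS = ["gemini-2.5-flash", "claude-sonnet-4"]; // hypothetical names

// Pure helper: build the request body for a given turn, alternating models.
function buildRequest(turn: number, history: Message[]) {
  return {
    model: MODELS[turn % MODELS.length],
    messages: history,
    stream: false,
  };
}

async function chatTurn(turn: number, history: Message[], userText: string) {
  history.push({ role: "user", content: userText });
  const res = await fetch("http://localhost:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(turn, history)),
  });
  const data = await res.json();
  const reply: Message = data.choices[0].message;
  history.push(reply); // the next model sees everything said so far
  return reply.content;
}
```

Because the whole history rides along in every request, whichever model handles the next turn sees the full context, including the other model's replies.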

  • I did this in about 400 or 500 lines of TypeScript with direct API calls into Vertex AI (still using a library for auth). It supports zod for structured outputs (Gemini 2.5 supports JSON Schema proper, not just the OpenAPI-style schemas the previous models did), with tools being optional. It includes a nice agent loop that integrates well with it, and your tools get auto-deserialized, strongly typed args (type inference in TS these days is so good). It probably could have been less code if I had used Google's genai lib and Anthropic's SDK — I didn't use them because it really wasn't much code anyway, and I wanted to inject auditing at the lowest level and know the library wasn't changing anything.

    If you really want a library, Python has litellm, and TypeScript has Vercel's AI SDK. I am sure there are many others, in other languages too.

  • It's a godforsaken nightmare.

    There are a lot of Potemkin villages, particularly in Google land. Gemini needed highly specific handholding. It's mostly cleared up now.

    In all seriousness, more or less miraculously, the final Gemini stable release went from something like 20-30% success at JSON edits to 80-90%, so you could stop parsing Aider-style edits out of prose.
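On the "abstract out tool calls" point: the core of it really is small in TypeScript. A minimal sketch of a tool registry where the handler's argument type is inferred from a runtime parser (the hand-rolled parser here is a stand-in for a zod schema; the tool name and logic are made up for illustration):

```typescript
// Minimal tool-call abstraction: each tool pairs a runtime parser with a
// handler, and TypeScript infers the handler's argument type from the parser.
// The parser is a hand-rolled stand-in for a schema library like zod.

type Parser<T> = (raw: unknown) => T;

interface Tool<T> {
  name: string;
  parse: Parser<T>;
  run: (args: T) => string;
}

// Identity helper so T is inferred from `parse`, keeping `run` strongly typed.
function defineTool<T>(tool: Tool<T>): Tool<T> {
  return tool;
}

// Hypothetical example tool.
const weather = defineTool({
  name: "get_weather",
  parse: (raw) => {
    const o = raw as { city?: unknown };
    if (typeof o?.city !== "string") throw new Error("city must be a string");
    return { city: o.city };
  },
  run: (args) => `It is sunny in ${args.city}`, // args: { city: string }
});

// Dispatch a model's tool call: look up the tool by name, deserialize and
// validate the JSON arguments, then invoke the typed handler.
function dispatch(tools: Tool<any>[], name: string, rawJson: string): string {
  const tool = tools.find((t) => t.name === name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.run(tool.parse(JSON.parse(rawJson)));
}
```

An agent loop then just feeds the model's tool-call name and argument string into `dispatch` and appends the result to the conversation; malformed arguments fail at the parse step instead of inside the handler.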