Comment by roan-we
3 days ago
I built a side project that uses AI agents to summarize research papers and extract key citations, and I kept hitting the same annoying pattern. I would get everything tuned to perfection with GPT-4, and then a couple of weeks later it would start hallucinating references or missing citations. I used to waste my Saturday mornings changing prompts and switching models instead of actually using the thing.
Kalibr pretty much freed me from that loop.
I basically set up GPT-4 and Claude as two different routes, defined success as accurate citations that I can verify, and now it just works.
Last week, GPT-4 started getting oddly slow on longer papers, and by the time I noticed, the traffic had already been automatically diverted to Claude.
It's the difference between babysitting an agent and actually having a tool that keeps working without constant supervision.
Honestly, I wish I had discovered this a few months ago, hehe.
This made my day. Exactly the use case we had in mind. Really glad it's working for you, and that GPT-4 slowdown story is a perfect example of why canary traffic matters. Thanks for sharing this.