Comment by jonathanhefner
8 hours ago
Thank you for sharing! Based on your experience, do you think a two-model system might fare better? For example, two models in serial where the second model is trained to "sniff out" potential hallucinations and fact check them (and possibly iterate with the first model)?
I do think it might improve things, but only marginally.
You are, however, likely to see better results with smaller models, since they're usually more strapped for "cognitive capacity": two separate calls reduce the load in each request, and in my experience hallucination is a common side effect of cognitively overloading an LLM.
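
For concreteness, here is a rough sketch of the serial two-model loop being discussed: one call drafts an answer, a second call only checks it, and any objection is fed back to the first. Everything here is an assumption made for illustration: `answer_with_check`, `generate`, `verify`, and the toy stand-ins are hypothetical names, and a real setup would replace the toy functions with actual model calls.

```python
from typing import Callable, Optional

# Hypothetical interfaces (assumptions for this sketch, not a real library API):
#   generate(question, feedback) -> draft answer
#   verify(question, draft)      -> None if the draft looks fine, else an objection
Generate = Callable[[str, Optional[str]], str]
Verify = Callable[[str, str], Optional[str]]


def answer_with_check(question: str, generate: Generate, verify: Verify,
                      max_rounds: int = 3) -> str:
    """Draft with one model, check with a second, and retry on objections."""
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_rounds):
        # Each call has a single job, which keeps the per-request load small.
        draft = generate(question, feedback)
        feedback = verify(question, draft)
        if feedback is None:
            return draft
    # Out of rounds: return the last draft even though it was still flagged.
    return draft


# Toy stand-ins so the sketch runs without any model behind it.
def toy_generate(question: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "Lyon is the capital of France."  # simulated hallucination
    return "Paris is the capital of France."


def toy_verify(question: str, draft: str) -> Optional[str]:
    return None if "Paris" in draft else "The capital city looks wrong."


if __name__ == "__main__":
    print(answer_with_check("What is the capital of France?",
                            toy_generate, toy_verify))
```

The `max_rounds` cap is there because nothing guarantees the two models ever agree, so the loop needs a way to stop.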