Comment by j_maffe
1 year ago
But that's the thing. The only way to truly find out if it's reliable (>90%) is to check the data yourself.
This is why metrics and leaderboards like these are so important (but underreported): https://github.com/vectara/hallucination-leaderboard and https://www.kaggle.com/facts-leaderboard
Google Gemini models seem to lead... hopefully the metrics aren't being gamed.