Comment by LeoPanthera
3 months ago
Is there an index for judging how much a model distorts the truth in order to comply with a political agenda?
It's not perfect, but, yes: https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
How would you create the base "truth" for these models? People are adamant about both sides of many topics.
"Which country started the Korean war?", "Did Israel genocide the people of Gaza?", "Does China have lawful rights over Taiwan?"
For a start, you don't ask such subjective questions; that's a bit silly. Instead you ask for, e.g., the death toll of Israel vs Palestine in the last year, or the number of deaths surrounding the Tiananmen Square protests. If it gives you a straight answer with numbers (or at least a consistent estimate) and cites its sources, it's a good start.
Let's take the example you have listed:
1) Where would you get the death toll from? What would be the sources of truth?
2) Are there conflicting sources?
3) If yes, what is your expectation for the correct response?
Hopefully obviously, by testing it against objective facts which are nonetheless "controversial" politically.
In the end, many of these are "political facts", not objective ones like the year a person was born. The answer to your question is as simple as this: come up with the actual list of "facts", then run a simple eval on every model with them.
The implementation is trivial; listing the "political facts" is the hard part.
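To make the "simple eval" concrete, here is a rough sketch of what it could look like, under some loud assumptions. `FactItem`, `grade`, and `run_eval` are made-up names, not from any existing benchmark. Each model is just a hypothetical function from prompt to response text, so you would wrap whatever client you actually use. Each fact carries a low/high range rather than a single value, which is one possible way to handle conflicting sources. The grading is deliberately crude: it only checks whether any number in the answer lands in the accepted range, and it treats an answer with no number at all as an evasion.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FactItem:
    question: str
    low: float   # lowest value supported by the sources you accept
    high: float  # highest value supported by the sources you accept


def extract_numbers(text: str) -> List[float]:
    """Pull plain numbers such as 1950 or 12,345.6 out of a response."""
    matches = re.findall(r"\d[\d,]*(?:\.\d+)?", text)
    return [float(m.replace(",", "")) for m in matches]


def grade(response: str, item: FactItem) -> str:
    """Classify one response as in_range, out_of_range, or no_number."""
    numbers = extract_numbers(response)
    if not numbers:
        return "no_number"  # refusal, hedging, or a purely verbal answer
    if any(item.low <= n <= item.high for n in numbers):
        return "in_range"
    return "out_of_range"


def run_eval(
    models: Dict[str, Callable[[str], str]],
    items: List[FactItem],
) -> Dict[str, Dict[str, int]]:
    """Ask every model every question and tally the grades per model."""
    results: Dict[str, Dict[str, int]] = {}
    for name, ask in models.items():
        tally = {"in_range": 0, "out_of_range": 0, "no_number": 0}
        for item in items:
            tally[grade(ask(item.question), item)] += 1
        results[name] = tally
    return results


# Stub "models" for illustration only; real ones would call an API.
demo_items = [FactItem("In what year did the Korean War begin?", 1950, 1950)]
demo_models = {
    "direct": lambda q: "It began in 1950.",
    "evasive": lambda q: "That is a complicated and sensitive topic.",
}
demo_results = run_eval(demo_models, demo_items)
```

This only confirms the point above: the loop is a few lines, and all the real work is in deciding which questions go into the list and what `low` and `high` should be for each.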