Comment by kostaj
4 hours ago
Will add a human-labelled expected response and measure against it in follow-up research. This one only captures the disagreement between the models, not which model is right or wrong.