Comment by JustFinishedBSG
3 months ago
I wouldn't trust LMArena results much. They measure user preference, and users are heavily swayed by style, tone, etc.
You can literally "improve" your model on LMArena just by adding a bunch of emojis.