Comment by esafak 3 months ago
Yes, of course.
uejfiweun 3 months ago
Wow. If all the trillions only produce that small of a diff... that's shocking. That's the sort of knowledge that could pop the bubble.
JustFinishedBSG 3 months ago
I wouldn't trust LMArena results much. They measure user preference, and users are heavily swayed by style, tone, etc. You can literally "improve" your model on LMArena by just adding a bunch of emojis.