Comment by jampekka 3 months ago
1491 vs 1418 Elo means the stronger model wins about 60% of the time.
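As a quick check of the 60% figure, here is a minimal sketch assuming the standard Elo logistic formula (base 10, scale 400). The leaderboard in question may fit its ratings differently, so treat the numbers as approximate. The function names `elo_expected_score` and `elo_odds` are illustrative, not taken from any library.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo formula.

    With no ties this is A's win probability; with ties, a tie counts as half.
    The base-10, scale-400 convention is an assumption here.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_odds(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Odds (wins per loss) of A against B implied by the rating gap."""
    return 10.0 ** ((rating_a - rating_b) / scale)


# Ratings quoted in the comment above.
p_strong = elo_expected_score(1491, 1418)
p_weak = elo_expected_score(1418, 1491)
odds = elo_odds(1491, 1418)

print(f"stronger model expected score: {p_strong:.3f}")
print(f"weaker model expected score:   {p_weak:.3f}")
print(f"implied odds (strong : weak):  {odds:.2f} : 1")
```

Under these assumptions the 73-point gap gives the stronger model an expected score of about 0.60 and the weaker one about 0.40, which is odds of roughly 1.5 to 1. That ratio describes how often one answer is preferred in a head-to-head comparison, not how much "better" one model is in any absolute sense.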
supermatt 3 months ago
Probably naive questions:
Does that also mean that Gemini-3 (the top-ranked model) loses to Mistral 3 40% of the time?
Does that make Gemini 1.5x better, or Mistral 2/3 as good as Gemini, or can we not quantify the difference like that?

esafak 3 months ago
Yes, of course.

uejfiweun 3 months ago
Wow. If all the trillions only produce that small of a diff... that's shocking. That's the sort of knowledge that could pop the bubble.