Comment by wmf 2 days ago
At least there shouldn't be any complaints about benchmaxing this time.

i_have_an_idea 2 days ago
Just because it is performing rather poorly by comparison, it doesn’t mean it isn’t benchmaxxed. It can still be worse than it appears.

wasabi991011 2 days ago
It isn't benchmaxxed because they are using human preference as an evaluation.