Comment by k__

8 months ago

Yes, you often see huge gains on some benchmark, but then the model is run through Aider's polyglot benchmark and doesn't even hit 60%.
