Comment by talles

1 year ago

Does anyone else want more articles on how those benchmarks are created and how they work?

Those models can be trained in a way tailored to get good results on specific benchmarks, making them far less general than they seem. No accusation from me, but I'm skeptical of all the recent so-called 'breakthroughs'.