Comment by jdietrich

1 month ago

Benchmark scores are table stakes - necessary but not sufficient to demonstrate the capabilities of a model. Casual observers might just look at the numbers, but anyone spending real money on inference will run their own tests on their own problems. If your model doesn't perform as well as its scores suggest, you will be found out very quickly.