Comment by n2d4

2 years ago

> and that is assuming their model is not overtrained to just pass the tests slightly better than GPT4

You are assuming GPT4 didn't do the exact same!

Seriously, it's been like this for a while: with LLMs, any benchmark other than human feedback is useless. I guess we'll see how Gemini performs when it's released next year and independent groups can compare them.