Comment by binsquare
2 years ago
I'm surprised to see Perplexity's 70B online model score so low on model quality, somehow far worse than Mixtral and GPT-3.5 (they use a fine-tuned GPT-3.5 as the foundation model, AFAIK).
I run https://www.labophase.com, and my data suggests it's one of the top 3 models in terms of how much users like interacting with it. May I ask how model quality is benchmarked, so I can understand this discrepancy?
The model quality index methodology is described in this comment (you can add Perplexity using the dropdown): https://news.ycombinator.com/item?id=39014985#39017632
It's a combination of different quality metrics, on which Perplexity, overall, does not perform as well. That said, I think we are in the very early stages of model quality scoring/ranking, and (for closed-source models) we are seeing frequent changes. It will be interesting to see how the measures evolve and how model rankings change.
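
For what it's worth, here is a minimal sketch of what "a combination of different quality metrics" could mean in practice, assuming a simple weighted average over per-benchmark scores. The benchmark names, weights, and numbers below are hypothetical, purely for illustration, and not the actual methodology from the linked comment:

    from typing import Dict

    def quality_index(scores: Dict[str, float], weights: Dict[str, float]) -> float:
        # Weighted average of benchmark scores (each assumed to be on a 0-100 scale).
        total_weight = sum(weights[name] for name in scores)
        return sum(scores[name] * weights[name] for name in scores) / total_weight

    # Hypothetical benchmarks, weights, and scores -- illustrative only.
    weights = {"reasoning": 0.4, "knowledge": 0.4, "math": 0.2}
    models = {
        "model_a": {"reasoning": 72.0, "knowledge": 60.0, "math": 55.0},
        "model_b": {"reasoning": 75.0, "knowledge": 70.0, "math": 61.0},
    }
    for name, scores in models.items():
        print(name, round(quality_index(scores, weights), 1))

Under a scheme like this, a model that users enjoy chatting with can still rank low if the chosen benchmarks emphasize dimensions (e.g. math or knowledge recall) where it is weaker, which may explain part of the discrepancy with user-preference data.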