Comment by lelanthran

1 month ago

> It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.

It's only in the interests of the model builders to do that IFF the user can actually tell that the model is giving them the best value for a single dollar.

Right now you can't tell.

4 comments

lelanthran

fragmede 1 month ago

Why not? Seems like you'd just build the same app on each of the models you want to test and judge how they did.

lelanthran 1 month ago
> Why not? Seems like you'd just build the same app on each of the models you want to test and judge how they did.
I tried that on a few problems; even on the same model the results have too much variation.
When comparing different models, repeating the experiment gives you different results.
- hnuser40690 25 days ago
  
  Interesting point about model variation. It would be useful to run multiple trials and look at the statistical distribution of results rather than single runs. This could help identify which models are more consistent in their outputs.
  
  1 reply →