Comment by cycomanic
2 months ago
I was talking with a colleague the other day, and we came to the conclusion that, in our experience, if you're using LLMs as a programming aid, models are really being optimised for the wrong things.
At work I often compare locally run 4-30B models against various GPTs (we can only use non-local models for a few things, because of confidentiality issues). While e.g. GPT-4o gives better results on average, the chance of it making parts of the response up is high enough that one has to invest a significant amount of effort to check and iterate over the results. So the overall effort is not much lower than with the low-parameter models.
The problem is that both are just too slow to really iterate quickly, which makes things painful. I'd rather have a lower-quality model (but with a large context) that gives me near-instant responses than a higher-quality model that is slow. I guess that doesn't give you the same headlines as an improved score on some evaluation.
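A minimal sketch of the kind of side-by-side latency comparison described above, assuming a local model served behind an OpenAI-compatible endpoint (llama.cpp, Ollama, and vLLM all offer one). The endpoint URL and model names are placeholders, not a recommendation:

```python
# Rough timing harness: same coding prompt against a small local model
# and a hosted one. Base URL, API key, and model names are assumptions
# for illustration -- adjust for your own setup.
import time
from openai import OpenAI

PROMPT = "Refactor this function to avoid the nested loops: ..."

backends = {
    # local ~7B model behind an OpenAI-compatible API (e.g. Ollama)
    "local-small": (OpenAI(base_url="http://localhost:11434/v1",
                           api_key="unused"),
                    "qwen2.5-coder:7b"),
    # hosted frontier model; reads OPENAI_API_KEY from the environment
    "hosted-large": (OpenAI(), "gpt-4o"),
}

for name, (client, model) in backends.items():
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    elapsed = time.perf_counter() - start
    answer = resp.choices[0].message.content
    print(f"{name}: {elapsed:.1f}s for {len(answer)} chars")
```

Wall-clock time per round trip is what dominates the iterate-check-iterate loop the comment describes, so even a crude harness like this makes the quality-versus-latency trade-off concrete.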