Comment by tarruda
1 year ago
> What is a bit weird about AI currently is that you basically always want to run the best model,
I think the problem is thinking that you always need to use the best LLM. Consider this:
- When you don't need correct output (such as when writing a blog post, where there's no right/wrong answer), "best" can be subjective.
- When you need correct output (such as when coding), you always need to review the result, no matter how good the model is.
IMO you can get 70% of the value of high-end proprietary models by just using something like Llama 8b, which is runnable on most commodity hardware. That should increase to something like 80%-90% when using bigger open models such as the newly released "Mistral Small 3".
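To give an idea of how low the barrier is, here's a minimal sketch of querying a local 8B model from Python, assuming a locally running Ollama server and the `llama3:8b` model tag (both are just illustrative choices; any local runtime works):

```python
# Minimal sketch: query a locally running 8B model via the Ollama Python client.
# Assumes `pip install ollama`, a local Ollama server, and that the `llama3:8b`
# model has already been pulled (e.g. `ollama pull llama3:8b`) -- all of these
# are example choices, not a recommendation of one specific stack.
import ollama

response = ollama.chat(
    model="llama3:8b",
    messages=[
        {"role": "user", "content": "Summarize the tradeoffs of running LLMs locally."},
    ],
)

print(response["message"]["content"])
```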
With o1 I had a hairy mathematical problem recently related to video transcoding. I explained my flawed reasoning to o1, and it was kind of funny that it took roughly the same amount of time to figure out the flaw in my reasoning, but it did, and it also laid out detailed, mathematically correct reasoning to set me straight. Something like Llama 8b would've been worse than useless. I ran the same prompt past ChatGPT and Gemini, and both gave me sycophantic confirmation of my flawed reasoning.
> When you don't need correct output (such as when writing a blog post, there's no right/wrong answer), "best" can be subjective.
This is like, everything that is wrong with the Internet in a single sentence. If you are writing a blog post, please write the best blog post you can; if you don't have a strong opinion on "best," don't write.
This isn’t the best comment I’ve seen on HN, so you should delete it. Or stop gatekeeping.
For coding insights/suggestions as you type, similar to Copilot, I agree.
For rapidly developing prototypes or working on side projects, I find Llama 8b useless: it might take 5-6 iterations to generate something truly useful, compared to, say, one shot with Claude Sonnet 3.5 or OpenAI GPT-4o. That's a lot less typing and time wasted.