Comment by benterix

1 year ago

From the article:

> I expect to continue mostly using GPT-4o (and Claude 3.5 Sonnet)

I saw similar comments elsewhere and I'm stunned - am I the only one who considers 4o a step back when compared to 4 for textual input and output? It basically gives fast semi-useful answers that seem like a slightly improved 3.5.

I use GPT-4o mostly, but your specific use case might have a big impact here: 4o is very likely a distilled model, meaning that it has fewer weights and can thus run much faster on the same hardware. If that is the case, its general world knowledge must be less comprehensive by default. But it retained the strong reasoning capabilities of 4 through distillation and drastically improved on external tool use and vision. It also offers a much bigger context window. So if you're using it to automate complex tasks in your job that depend a lot on additional information that it hasn't seen during training, 4o is the obvious choice. If you're just using it as a search engine, you should probably stick with 4 for now.

I wholly agree with you. I've been using every model extensively since the early Davincis and I strongly believe that gpt-4-0314 was the best model they've released to date.

Its poor performance on benchmarks drives my skepticism of LLM benchmarking in general. I trust my feel for the models much more, and my feel was that 0314 was great.

The one thing that 0314 doesn't do well is tricks like structured output and tool calling, which makes it less useful as an agentic tool, but from a pure thinking perspective, I think it's the best.

  • That's my concern - they marked 4 as "legacy" in the GUI, and now they hid it temporarily under a submenu - but it's the only model I care about. If they remove it, there is no reason for me to use their services, especially with Claude 3.5's wider context window and reasonably good results.