Comment by Stitch4223

9 hours ago

It’s four poorly constructed arbitrary experiments which say very little about the competency of either model.

The article reads like thin, auto-generated AI clickbait for nerd sniping or shilling a model.

Consider the lead:

> DeepSeek V4 Pro wins this head-to-head by being more exact where it matters: following instructions, matching schemas, and solving edge cases cleanly. GPT-5.5 Pro is still strong, but it gave away points with avoidable deviations.

“where it matters”, “cleanly”, “is still strong”, and other vague references, instead of telling us that in 3 out of 4 tests DeepSeek yielded more concise results.

1 star.

(Three out of) four experiments is anecdotal for sure, but the result meshes with more established instruction-following benchmarks (although DeepSeek V4 Pro does not top these): https://artificialanalysis.ai/evaluations/ifbench

I found the writing clear and quite even-handed. The lead is a bit salesy, but leads typically are. Knee-jerk dismissals based on vibes that something is LLM-generated are quite low-effort.

  • It's picking strange tasks that don't really play to GPT-Pro's strengths (that model is roughly comparable to Mythos, intended for very hard reasoning and research-level problems) and then completely ignoring quite a few cases where GPT-Pro actually got some things more correct than DeepSeek did. The auto-AI ranking is just not reliable for this stuff.

I think you've misunderstood the purpose of a lead (sic).

Per Merriam-Webster [^1], a lede is:

> the introductory section of a news story that is intended to entice the reader to read the full story

(Emphasis mine)

You may prefer more matter-of-fact phrasing, of course, but criticising a lede for attempting to achieve its goal is unjustified.

[^1]: https://www.merriam-webster.com/dictionary/lede

  • A 'lede' is just an intentionally differentiated spelling of 'lead'; the origin of the word is just 'lead'. Collins Dictionary defines 'lede' as a variant spelling of 'lead'.

  • I think the criticism is less about whether the lede is good at achieving its goal and more about whether that goal is honorable in the first place.

    So dismissing it on technicalities is for sure clever but also obvious and lame.

    The letter/spirit thing eventually got boring. Please find better material.

    • I apologise if using words correctly is obvious and lame.

      GP is explicitly criticising the language in the lede as being unsuitably vague, hence my reply.

      As to the goal of the article, I fail to see what is dishonourable about comparing LLMs. You may consider the methodology flawed, but it's a perfectly respectable goal.

      Sorry, was that another technicality? I'll try to find better material, just for you.

  • It’s the hardest part of an article if you ask me.

    Filling it with slop constructs signals to the reader that no effort was made in writing the article. So no effort should be put into reading it.

    The rest of the article is equally flimsy. Great clickbait title, perhaps that is even harder than writing a lede.

    I am not a native speaker :)