Comment by quirino

5 days ago

I think equally impressive is the performance of the OpenAI team at the "AtCoder World Tour Finals 2025" a couple of days ago. There were 12 human participants and only one did better than OpenAI.

Not sure there is a good writeup about it yet but here is the livestream: https://www.youtube.com/live/TG3ChQH61vE.

9 comments

zeroonetwothree 5 days ago

And yet, when working on production code, current LLMs are about as good as a poor intern. Not sure why the disconnect.

kenjackson 5 days ago
Depends. I’ve been using it for some of my workflows and I’d say it is more like a solid junior developer with weird quirks: sometimes it makes stupid mistakes, and other times it behaves like a 30-year SME vet.
- Rioghasarig 4 days ago
  
  I really doubt it's like a "solid junior developer". If it could do the work of a solid junior developer it would be making programming projects 10-100x faster because it can do things several times faster than a person can. Maybe it can write solid code for certain tasks but that's not the same thing as being a junior developer.
  
roxolotl 4 days ago

It’s the same reason LeetCode makes for bad interview questions. Being good at these sorts of problems doesn’t translate directly to being good at writing production code.
riku_iki 5 days ago
Because competitive coding is a narrow, well-described domain (a limited number of concepts: lists, trees, etc.) with a high volume of training data available and an easy way to set up an RL feedback loop. Models can improve quickly in that kind of domain, which isn't true of typical overbloated enterprise software.
- quirino 5 days ago
  
  All you said is true. Keep in mind this is the "Heuristics" competition instead of the "Algorithms" one.
  Instead of the more traditional Leetcode-like problems, it's things like optimizing scheduling/clustering according to some loss function. Think simulated annealing or pruned searches.
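
For anyone who hasn't seen the technique, here is a minimal sketch of simulated annealing on a toy scheduling problem. Everything in it is an assumption made up for illustration, not anything from the contest: the problem (assign jobs to machines so the busiest machine finishes as early as possible), the function names `makespan` and `anneal`, and the cooling parameters.

```python
import math
import random


def makespan(assignment, durations, n_machines):
    """Loss function (hypothetical): total load of the busiest machine."""
    loads = [0.0] * n_machines
    for job, machine in enumerate(assignment):
        loads[machine] += durations[job]
    return max(loads)


def anneal(durations, n_machines, steps=20000, t_start=10.0, t_end=0.01, seed=0):
    """Simulated annealing over job-to-machine assignments (illustrative only)."""
    rng = random.Random(seed)
    current = [rng.randrange(n_machines) for _ in durations]
    current_cost = makespan(current, durations, n_machines)
    best, best_cost = current[:], current_cost

    for step in range(steps):
        # Geometric cooling schedule from t_start toward t_end (an assumed choice).
        temp = t_start * (t_end / t_start) ** (step / steps)

        # Neighbor move: reassign one random job to a random machine.
        job = rng.randrange(len(durations))
        old_machine = current[job]
        current[job] = rng.randrange(n_machines)
        new_cost = makespan(current, durations, n_machines)

        delta = new_cost - current_cost
        # Always accept improvements; accept a worse state with
        # probability exp(-delta / temp), which shrinks as temp drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        else:
            current[job] = old_machine  # reject: undo the move

    return best, best_cost


if __name__ == "__main__":
    jobs = [3, 7, 2, 8, 5, 4, 6, 1]  # made-up job durations
    assignment, cost = anneal(jobs, n_machines=3)
    print(assignment, cost)
```

The point is that there is no known "correct" answer to check against, only a score to push down, so the work is mostly in picking good neighbor moves and a cooling schedule rather than recalling a textbook algorithm.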
  