Comment by eru

4 days ago

I'm trying out o3-pro now with some algorithmic questions. It seems to be doing alright, but it's taking an awfully long time (as expected) and the UIs seem to time out a lot, especially the Android app and the MacOS desktop app. The web interface seems the least flaky, but that's not saying much.

I had my own programming question for a while, which all models from all vendors been robustly failing so far. It is a known problem with surprisingly few published implementations as it had never been a part of leetcode, Euler or typical homework assignments. Yesterday o3-pro cleared it, using a more obscure algorithm I never even heard of.

  • Can you provide the details? Sounds intriguing

    • The frustrating thing about private problems like this is that if you respond to requests like this, it'll become part of the training data. I'm fairly certain HN is scraped because several AIs know my HN alias and can replicate my style of writing on demand.

      PS: Thinking about it... that is a very specific kind of disturbing feeling that only prolific online commenters can experience...

      There's a soulless machine someone made that -- out of billions of people on the planet -- specifically knows me by name and at some level understands how I think and see the world.

      That's not even its explicit purpose! It and its maker have never met me, interacted with me, or singled me out in any way. Yet... it knows my voice and can copy it on demand.

      3 replies →

    • I'm not varjag, but I can give you a flavour of the problems I've tried:

      ---

      Here's an algorithmic problem:

      You are getting a stream of n unsorted number (say over a network socket, but it doesn't matter). You don't know n upfront. We want to find the k largest numbers in that stream.

      You can use O(n) time and O(k) space. We are in the comparison model.

      The items arrive one by one, if you want to refer to any earlier item, you need to store it yourself (and it counts against your O(k) budget.)

      Is this possible? If yes, please give me the algorithm. If not, please sketch a proof showing that it's not possible.

      ---

      The above is indeed solvable in linear time, it's fairly easy for a human to figure out. Another one (and this on is rather hard, took me a few years and I'm writing up a paper):

      ---

      Here's an algorithmic problem:

      You are given a sequence of opening and closing parens. Each item in the sequence has a positive weight. We want to find the _heaviest_ balanced subsequence in linear time in the comparison model, or prove that this task is not possible.

      I'm ok with randomised algorithms. In that case, I want expected worst-case linear time, where the expectation is taken over the random bits and the worst-case over the inputs.

      ---

      The above task is really solvable in linear time, and even deterministically. But so far no AI model has beaten it. As far as I can tell, it's a new result, despite looking fairly elementary.