Comment by meetpateltech

11 hours ago

Karpathy will start this week on Anthropic's pre-training team, which is responsible for the massive training runs that give Claude its core knowledge and capabilities, according to Anthropic.

Source: https://www.axios.com/2026/05/19/anthropic-openai-karpathy-a...

Specifically it looks like he's planning to extend the ideas from https://github.com/karpathy/autoresearch into a larger effort towards recursive training improvement [1]:

> Excited to welcome Andrej to the Pretraining team! He'll be building a team focused on using Claude to accelerate pretraining research itself. I can’t think of anyone better suited to do it — looking forward to what we build together!

[1] https://x.com/nickevanjoseph/status/2056760504949842219

  • Am I the only one who wasn’t particularly impressed by AutoResearch? If you looked at what the agent was actually doing, it was just tuning parameters mostly, not really trying different novel approaches.

    I couldn’t help but consider this mostly a very inefficient variant of hyperparameter optimization, but someone correct me if I’m wrong; I may be looking at this too pessimistically.

    • I didn't dig into what the actual repository was doing, but personally, I took some inspiration from the idea after reading about it and realizing that I might have been underestimating the ability of LLMs. I put a bit more work into a performance harness I was using locally and just set some agents to brainstorming and they did seem to find some great stuff. So I don't really have a stance one way or another on this specific repo, but the general idea seems like a really good one.

    • Karpathy embedded within an organization is way more impressive than him out on his own with hot takes and little projects. I hope he does great things for Anthropic.

    • Inefficient variants with $100m+ worth of compute will still probably outperform the best team of researchers

  • I guess we must expect it at this point. But it's funny that it has model-written tokens like ’ instead of '

  • More like he'll blog and tweet about using Claude and get gullible software engineers to buy Claude subscriptions and work on their own obsolescence while paying for it.

    Many people are still deluded and think he is the same person who wrote the informal AI tutorials in plain HTML. He isn't; he is selling stuff now.

    • I'm as jaded as can be but I think Anthropic is now beyond the point where they'd place much value on farming Karpathy's name recognition. I'm sure they considered it an extra plus in his hiring package but they wouldn't do the level of comp package he'd want if they didn't believe the odds were decent that he'll contribute serious value.

      Sure, it can always not work out but that's no more a risk with him than any high-profile hire who doesn't really need the money and will always have other options.

    • What is he selling? How is this time different compared to when he was at OpenAI or at Tesla? You could say he was shilling those products too. I don't see any shift. He's still posted free in-depth YouTube videos recently.

This is a good branding move for Anthropic. Karpathy is well respected among the ML crowd.

  • Speaking of, how did he not lose credibility at “full self driving next year, better buy it now”-Tesla?

    It might be Elon who went and said that and said they don’t need lidar, but as director of AI and auto vision Karpathy bears the responsibility for those features.

    • >Speaking of, how did he not lose credibility at “full self driving next year, better buy it now”-Tesla?

      That I also want to know. He bailed out of Tesla right when the limitations of his "LIDAR-less cameras only self driving" system were becoming obvious, and nobody asked him about the hindsight of this obvious fuckup.

      >but as director of AI and auto vision Karpathy bears the responsibility for those features.

      Exactly. You lead the R&D, so it's on you. If your boss makes stupid decisions in public overriding your best judgement, then leave and go somewhere where your decisions will be respected. The ML market was red hot for people like him back then so it's not like he didn't have alternatives.

      Although I doubt Elon forced that idea on him, since he's the one who was confidently claiming that vision only is better since Lidar pollutes the sensor fusion data.

Why do they need this when they have the next gen mythos? Surely that can manage everything?

  • You don't understand: no one's ever reading more than 1% of the training material, so they need someone who has reduced that to 0.1%. The less you know, the more you know!