Comment by aurareturn

6 days ago

Six days of work to do this. Even if it doesn't end up being meaningful, it shows just how tightly tokens and work done will be linked, now and in the future.

It's going to be hard to compete with someone or a company that has more compute. They will just be able to do things you can't.

Translating a project that includes a good test suite from one language to another is known to be a great case where LLMs work well.

When you’re starting with a complete codebase to use as an example and a test suite to check everything, it’s much easier to iterate toward the desired goal. The LLM can see what the goals are and how they’ve been implemented once already, which is a much easier problem than starting from a spec.
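To make that concrete, here's a minimal sketch in Rust of the "tests as oracle" loop being described (the `cargo test` invocation and the `ask_llm_for_fix` placeholder are my assumptions, standing in for whatever the actual agent harness does):

```rust
use std::process::Command;

// Sketch of the translate-then-iterate loop: the inherited test suite is
// the oracle, and the agent keeps patching the port until it passes.
fn tests_pass() -> bool {
    Command::new("cargo")
        .arg("test")
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

// Placeholder for the agent step: feed failing test output back to the
// model and apply whatever patch it proposes. (Hypothetical; not a real API.)
fn ask_llm_for_fix() {}

fn main() {
    for attempt in 1..=100 {
        if tests_pass() {
            println!("port converged after {attempt} attempt(s)");
            return;
        }
        ask_llm_for_fix();
    }
    println!("gave up: tests still failing after 100 attempts");
}
```

The model never has to decide what "done" means; the test suite inherited from the original codebase decides that for it.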

  • Great case where Rust works well too. I won't cite all the famous libraries that got rewritten in Rust, but it wasn't all done with LLMs.

    • I fail to think of a successful Rust rewrite. So far, what I've seen is programmers who aren't sufficiently experienced deciding to pick Rust, rewriting something in it, and then (this is the bad part) claiming it's better for that reason alone. It never is. It's always worse, because rewrites fundamentally end up with a worse product first.

  • Sure, but given that, doesn't it seem like the conclusion is: if you have something that could in principle be reverse-engineered by a competitor with more compute, they can and will steal it, because the only constraint is ROI?

  • The goalposts are always moving. This would have been an unthinkable task a couple of years ago.

Unclear. Very good products tend to be about doing one or a few things very well, not about doing tons of stuff. So far, all I see is “Man, I’m a 10x engineer now!”: shipping more code, but without clear direction or taste. At this point, most LLM-based work is just noise.

You could have said the same thing about steam power or electricity. And it’s not just an analogy: The magic of these things is in being universal information engines. You spend capital to build them, using well-understood, scalable techniques, plug them into electricity, and out comes value.

My point is, there’s no chance of a “haves and have-nots” divide emerging, any more than electricity turned out that way in the modern world.

  • Electricity might be a good analogy - but for the other side of this argument.

    In the US, (nearly) full electrification wasn't achieved until the late 1940s/early 1950s - a process of nearly a century. (A bit of personal trivia: my great-grandfather worked on crews electrifying rural areas of the Midwest.)

    • We already have SOTA local inference devices in everyone’s pocket, which also provide high bandwidth access to SOTA data center inference at what is rapidly becoming commodity pricing.

      What comparable gap is there to bridge?

It's a new era of capital, literally, in software development. Ownership of the means of production is now concentrated.

Nah. These agents are getting easier and easier to run locally. Have you tried Qwen 3.6 27b? It’s insane what it can do for its size. You can 100% vibe-code small projects if you manage context properly.

These models are in a race to the bottom, just like compute.

  • I don’t think it matters. Local models becoming better has not stopped demand for SOTA models.

    • My guess is it won’t be worth it to focus specifically on coding models once local small models work just as well, or close to it. That will naturally close the gap even more.

I can't help but wonder what this cost in USD, assuming you paid standard rates to Anthropic. Can someone even ballpark the price?

  • Much less than what it’d cost for a team of Rust engineers.

    This is both amazing and scary, and has been for a while now.

    • It costs several times what it would cost a small team of engineers, even assuming you gave the engineers more time to do it. I'm guessing (wildly) this was around 0.5M USD in compute time. You do get the result quicker, though.

  • ~10k lines cost ~$250 in OpenAI API calls (pay-as-you-go, no subscription plan).

    At that rate, the Linux kernel's ~45 million lines would come to ~$1.125M.

    Bun's ~950k lines would come to ~$23,750.

    Use whatever math you like, of course; a rough extrapolation is sketched below.

    Does Anthropic (or an employee) pay that? No. Even if it's at a loss in terms of company revenue, it's worth burning the private capital for all kinds of other reasons.
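    For what it's worth, the figures above are just linear scaling from the ~$250-per-10k-lines data point; here's a throwaway sketch (line counts taken from the comment, the per-line rate is a naive assumption):

    ```rust
    fn main() {
        // Data point from the comment above: ~10k lines cost ~$250 in API calls.
        let dollars_per_line = 250.0 / 10_000.0; // $0.025/line, naive linear model

        // Project sizes as quoted above.
        for (name, lines) in [("Linux kernel", 45_000_000_u64), ("Bun", 950_000_u64)] {
            let cost = lines as f64 * dollars_per_line;
            println!("{name}: ~{lines} lines -> ~${cost:.0}");
        }
    }
    ```

    Real costs likely won't scale perfectly linearly (retries, context re-reading, and iteration loops all push it up), but it gives an order of magnitude.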

There are not many companies that live off taking a full test suite built over decades and just generating code from it.