Comment by simonw
20 hours ago
I nodded furiously at this bit:
> The hard part of computer programming isn't expressing what we want the machine to do in code. The hard part is turning human thinking -- with all its wooliness and ambiguity and contradictions -- into computational thinking that is logically precise and unambiguous, and that can then be expressed formally in the syntax of a programming language.
> That was the hard part when programmers were punching holes in cards. It was the hard part when they were typing COBOL code. It was the hard part when they were bringing Visual Basic GUIs to life (presumably to track the killer's IP address). And it's the hard part when they're prompting language models to predict plausible-looking Python.
> The hard part has always been – and likely will continue to be for many years to come – knowing exactly what to ask for.
I don't agree with this:
> To folks who say this technology isn’t going anywhere, I would remind them of just how expensive these models are to build and what massive losses they’re incurring. Yes, you could carry on using your local instance of some small model distilled from a hyper-scale model trained today. But as the years roll by, you may find not being able to move on from the programming language and library versions it was trained on a tad constraining.
Some of the best Chinese models (which are genuinely competitive with the frontier models from OpenAI / Anthropic / Gemini) claim to have been trained for single-digit millions of dollars. I'm not at all worried that the bubble will burst, new models will stop being trained, and the existing ones will lose their utility - I think what we have now is a permanent baseline for what will be available in the future.
The first part is surely true if you change it to "the hardEST part" (I'm a huge believer in "Programming as Theory Building"), but there are plenty of other hard or just downright tedious/expensive aspects of software development. I'm still not fully bought in on some of the AI stuff: I haven't had a chance to really apply an agentic flow to anything professional, I pretty much always get errors even when one-shotting, and who knows if even the productive stuff is big-picture economical. But I've already done some professional "mini projects" that just would not have gotten done without an AI. A simple example: I converted a C# UI to Java Swing in less than a day, a few thousand lines of code, a simple utility but important to my current project for <reasons>. Assuming tasks like these can be done economically over time, I don't see any reason why small- and medium-difficulty programming tasks can't be achieved efficiently with these tools.
> claim to have been trained for single-digit millions of dollars
Weren't these smaller models trained by distillation from larger ones, which therefore have to exist in order to do it? Are there examples of near-state-of-the-art foundation models being trained from scratch for low millions of dollars? (This is a genuine question, not arguing. I'm not knowledgeable in this area.)
The DeepSeek v3 paper claims to have trained from scratch for ~$5.5m: https://arxiv.org/pdf/2412.19437
Kimi K2 Thinking was reportedly trained for $4.6m: https://www.cnbc.com/2025/11/06/alibaba-backed-moonshot-rele...
Both of those were frontier models at the time of their release.
Another interesting number here is Claude 3.7 Sonnet, which many people (myself included) considered the best model for several months after its release, and which was apparently trained for "a few tens of millions of dollars": https://www.oneusefulthing.org/p/a-new-generation-of-ais-cla...
Hardest part of programming is knowing wtf all the existing code does and why.
And that is the superpower of LLMs. In my experience, LLMs are better at reading code than writing it. Have one annotate some code for you.
Still, code describes what and how, but not why.
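A toy illustration of that gap (hypothetical code, not from the thread): the logic below is perfectly readable, but nothing in it explains the constraint that motivated it, which is exactly the context an annotation, a commit message, or an LLM with project knowledge has to supply.

```python
def retry_delay(attempt: int) -> float:
    # WHAT/HOW is visible from the code itself:
    # exponential backoff, capped at 30 seconds.
    return min(2 ** attempt, 30.0)

# WHY lives outside the code (hypothetical scenario): the upstream API starts
# rejecting clients that keep retrying past ~30s, so the cap isn't arbitrary -
# removing it would reintroduce an incident the team already hit once.
```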
I do!
Defining the exact problem in precise language is maybe not the MOST valuable part of prompting an LLM during a task, but it's one of them. I don't just blindly turn to an LLM without understanding the problem first, but I do find Claude is better than a cardboard cutout of a dog.
Aren't they also losing money on the marginal inference job?
I think it is very unlikely that they are charging less money for tokens than it costs them to serve those tokens.
If they are then they're in trouble, because the more paying customers they get the more money they lose!
Operating at a loss to buy market share is pretty much the norm at this point. Look behind the curtain at any “unicorn” for the past 3 decades and you’ll see VCs propping up losses until the general population has grown too dependent on the service to walk away when the pricing catches up to reality.
I guess that depends on the user; most people are not getting the most out of flat-priced subscriptions. Overall they probably make a profit, and definitely on API use, but some users will just spend a lot more. It'll get cheaper though; they're still in acquisition mode as long as there's VC money.
Indeed, while DeepSeek 3.2 or GLM 4.7 are not Opus 4.5 quality, they're close enough that I could _get by_ with them; they're about where I was with Sonnet 3.5 or Sonnet 4 a few months ago.
I'm not convinced DeepSeek is making money hosting these, but I suspect it's not far off. They could triple their prices and still be cheaper than Anthropic is now.
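For a rough sense of scale (the per-token list prices below are approximate figures I'm assuming, not numbers from the thread):

```python
# Assumed approximate API list prices, USD per million output tokens (late 2025):
deepseek_output = 0.42   # DeepSeek V3.2 (approximate)
opus_output = 25.00      # Claude Opus 4.5 (approximate)

tripled = 3 * deepseek_output
print(f"DeepSeek x3: ${tripled:.2f}/M vs Opus: ${opus_output:.2f}/M "
      f"-> still ~{opus_output / tripled:.0f}x cheaper")
# DeepSeek x3: $1.26/M vs Opus: $25.00/M -> still ~20x cheaper
```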