Comment by nl

4 hours ago

I don't think anyone is claiming they can one-shot a 50k-line SaaS app.

I think you'd get close on something like Lovable, but that's not really one-shot either.

But re-reading your statement, you seem to be claiming that no 50k-line SaaS apps have been built even with multi-shot techniques (i.e., building one feature at a time).

In that case my Vibe-Prolog project would count: https://github.com/nlothian/Vibe-Prolog/

  - It's 45K lines of Python code
  - It isn't a duplicate of another program (indeed, the reason it isn't finished is that it is stuck between ISO Prolog and SWI-Prolog, and I need to think about how to resolve this, but I don't know enough Prolog!)
  - Not a *single* line of code is handwritten.

Ironically, this doesn't really prove that the current frontier models are better, because large amounts of the code were written with non-frontier models (you can sort of get an idea of which models were used from the labels on https://github.com/nlothian/Vibe-Prolog/pulls?q=is%3Apr+is%3...).

But, importantly, this project is what convinced me that the frontier models are much better than the previous generation. There were numerous times when I tried something in a non-frontier model that couldn't do it, and then I'd try the same thing in Claude, Codex or Gemini and it would succeed.