
Comment by jacquesm

8 days ago

That's a nice example, can you explain your 'one shot' setup in some more detail?

I don't have the prompt, but I used codex. I probably wrote a medium-sized paragraph explaining the architecture. It scaffolded out the app, and I think I prompted it twice more with some very small bugfixes. That got me to an MVP, which I used to build LaTeX pipelines. Since then, I've added a few features as I've dogfooded it.

It's a bit challenging / frustrating to get LLMs to build out a framework/library and the app that uses that framework at the same time. If it hits a bug in the framework, sometimes it will rewrite the app to match the bug rather than fixing the bug. It's kind of a context balancing act, and you have to have a pretty good idea of how you're looking to improve things as you dogfood. It can be done, but it takes some juggling.

I think LLMs are good at golang, and also good at that "lightweight utility function" class of software. If you keep things skeletal, I think you can avoid a lot of the slop feeling when you get stuck in a "MOVE THE BUTTON LEFT" loop.

I think dogfooding is another big key. I coded up a calculator app for a dentist's office, where 2-3 people use it about 25 times a day. Not a lot of moving parts; it's literally just a calculator. It could basically be an Excel spreadsheet, except an app is a lot better UX. It wouldn't have been software I'd have written myself, really, but in about 3 total hours of vibecoding it's been through two revisions.

If you can get something to a minimal functional state without a lot of effort, and you can keep your dev/release loop extremely tight, and you use it every day, then over time you can iterate into something that's useful and good.

Overall, I'm definitely faster with LLMs, though I don't know if I'm that much faster. I was probably most fluent building web apps in Django, and I was pretty dang fast with that. The skills with LLMs are more about things like "How do you build tests to prevent function drift?" and "How can I scaffold a feedback loop so that the LLM can debug itself?"
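To make the "tests to prevent function drift" idea concrete, here's a minimal sketch in Go (the function and cases are made up for illustration, not from the actual project): pin a small table of golden inputs/outputs so an LLM edit that silently changes behavior fails loudly instead of drifting.

```go
package main

import (
	"fmt"
	"strings"
)

// Slugify stands in for any small utility an LLM maintains.
// The golden table below pins its observable behavior.
func Slugify(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	return strings.ReplaceAll(s, " ", "-")
}

func main() {
	// Golden cases: if a later LLM edit changes any of these,
	// the feedback loop fails immediately with a clear message.
	cases := map[string]string{
		"Hello World":  "hello-world",
		"  LaTeX Run ": "latex-run",
	}
	for in, want := range cases {
		if got := Slugify(in); got != want {
			panic(fmt.Sprintf("drift: Slugify(%q) = %q, want %q", in, got, want))
		}
	}
	fmt.Println("no drift detected")
}
```

The point isn't the specific function; it's that the table, not the implementation, is the contract you hand back to the LLM on every iteration.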

  • I like your pragmatic attitude to all this.

    I think your prompts are 'the source' in the traditional sense, and the result of those prompts is almost like 'object code'. It would be great to have a higher-level view of computer source code like the one you are sketching: you'd distribute the prompt and the AI (toolchain...) to recreate the code, with the code itself as just one of many representations. This would also solve some of the copyright issues, as well as possibly some of the longer-term maintainability challenges. If you need to make changes to the running system in a while, the tool that got you there may no longer be suitable, unless there is a way to ingest all of the code it produced previously and then suggest surgical strikes instead of wholesale updates.

    Thank you for taking the time to write this all out; it is most enlightening. It's a fine line between 'naysayer' and 'fanboi', and I think you've found the right balance.

    • Thanks for reading it! I didn't use an LLM, lol.

      On documentation, I agree with you, and have gone down the same road. I actually built a little chat app which acts as a wrapper around the codex app and does exactly this. Unfortunately, the UI sucks pretty bad, and I never find myself using it.

      I actually asked codex if it could find the chat where I created this in my logs. It turns out I used the web interface and asked it to make a spec. Here's the link to the chat. Sorry, the way I described it wasn't really what happened at all! lol. https://chatgpt.com/share/69b77eae-8314-8005-99f0-db0f7d11b7...

      As it happens, I actually speak-to-texted my whole prompt. And then gippity glazed me, saying "This is a very good idea". And then it wrote a very, very detailed spec. As an aside, I kind of have a conspiracy theory that they deploy "okay" and "very very good" models, and they give you the good model based on whether they think it will help sway public opinion. So it wrote a pretty slick piece of software, and now here I am promoting the LLM. Oof da!

      I didn't really mention it before: spec-first programming is a great thing to do with LLMs. But you can go way too far with it, too. If you let the LLM run wild with the spec, it will totally lose track of your project goals. The spec it created here ended up being, I think, a very good one.

      I think "code readability" is really not a solved problem, either pre- or post-LLM. I'm a big fan of "Code as Data" static analysis tools. I actually think the ideal situation is less "here is the prompt history" and something closer to Don Knuth's Literate Programming. I don't actually want to read somebody fighting context drift for an hour. I want polished text which explains in detail both what the code does and why it is structured that way. I don't know how to make the LLMs do literate programming, but now that I think about it, I've never actually tried! Hmmm....