
Comment by ChrisMarshallNY

1 month ago

> The longer you let it drive without constraints, the worse the wreckage gets. The velocity makes you think you're winning right up until the moment everything collapses simultaneously.

In my experience (so far), I can’t let the LLM write too much in one go.

I need to test the hell out of what it gives me, and I can’t ask for too much, at one time.

I tend to ask it to “flesh out” functions, where I have a signature, and a detailed headerdoc comment. I will provide a lot of guidance about the context, often attaching relevant files.
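A minimal sketch of what such a stub might look like, written in Rust purely for illustration. The function `median` and its contract are hypothetical and not taken from the commenter's code. The doc comment and signature are the human-written part; the body is the small, tightly scoped piece the model would be asked to fill in.

```rust
/// Returns the median of `values`, or `None` if the slice is empty.
///
/// Contract (written by the human before any code is generated):
/// - The caller's slice must not be modified or reordered.
/// - For an even number of elements, return the mean of the two middle values.
/// - Inputs are assumed to contain no NaN.
fn median(values: &[f64]) -> Option<f64> {
    // Everything below is the part the model is asked to flesh out.
    if values.is_empty() {
        return None;
    }
    // A copy is needed here because the contract forbids reordering the input.
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        Some((sorted[mid - 1] + sorted[mid]) / 2.0)
    } else {
        Some(sorted[mid])
    }
}
```

Because the contract is spelled out up front, each bullet in the doc comment maps directly to a test case (empty input, odd length, even length), which is what makes the "test the hell out of it" step practical.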

Even then, it often doesn’t give me what I need, first time, unless it’s a small function, with extremely limited scope.

That said, it’s been extremely helpful. It has accelerated my development greatly.

I have found that it gives me much better PHP, than Swift.

I suspect that may be because PHP is extremely mature, and there’s millions and millions of lines of high-quality code out there, in open-source repos, while Swift is probably mostly in closed repos, with open stuff not really provided by experienced developers (it’s a proprietary language used for shipping commercial software, so that may also apply to other languages).

What it gives me in Swift, most closely resembles stuff that enthusiastic newer folks would do, and want to show off.

> What it gives me in Swift, most closely resembles stuff that enthusiastic newer folks would do, and want to show off.

The same is true for rust-lang. Code that will immediately clone/re-allocate anything passed by reference and collect everything to the heap that is passed by `Iterator`/`IntoIterator`.

It is a massive performance anti-pattern and the hallmark of somebody "struggling" with the borrow checker. Naturally a lot of 1st & 2nd 'I just learned rust' projects lean on it. Which is totally fine for humans, you're learning. But with LLMs that pattern is now burned into their eigenvectors with the heat of a billion hours of H100 training time.
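A small sketch of the pattern being described, under the assumption that the function only needs to read its inputs. Both function names are made up for illustration. The first version clones the borrowed slice and collects the iterator into a `Vec` before doing any work; the second computes the same result with no extra allocation.

```rust
/// Anti-pattern: copies the borrowed slice and collects the iterator
/// to the heap, even though both are only read once.
fn total_len_cloning<'a>(
    prefixes: &[String],
    words: impl IntoIterator<Item = &'a str>,
) -> usize {
    let prefixes: Vec<String> = prefixes.to_vec(); // deep copy of every String
    let words: Vec<&str> = words.into_iter().collect(); // allocation just to iterate once
    let prefix_len: usize = prefixes.iter().map(|p| p.len()).sum();
    let word_len: usize = words.iter().map(|w| w.len()).sum();
    prefix_len + word_len
}

/// Same result: borrows the slice as-is and consumes the iterator lazily.
fn total_len_borrowing<'a>(
    prefixes: &[String],
    words: impl IntoIterator<Item = &'a str>,
) -> usize {
    let prefix_len: usize = prefixes.iter().map(|p| p.len()).sum();
    let word_len: usize = words.into_iter().map(|w| w.len()).sum();
    prefix_len + word_len
}
```

Both compile and return the same value, which is part of why the first form survives review: nothing is wrong with it except the wasted allocations.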

It has gotten to the point that for all code I generate with Opus or Codex, if there is an iterator or reference in the arguments, I start a fresh context with something like `remove unnecessary clones, collections, and copies from the following code: {{code}}`

  • > It has gotten to the point that for all code I generate with Opus or Codex, if there is an iterator or reference in the arguments, I start a fresh context with something like `remove unnecessary clones, collections, and copies from the following code: {{code}}`

    What does it do if you put "Avoid unnecessary clones, collections, and copies" in your CLAUDE.md/AGENTS.md?

    • It makes no difference at all.

      Edit: With Opus prior to the context nerf, it worked more often than not. Current Opus 4.7 is practically unusable.

> In my experience (so far), I can’t let the LLM write too much in one go.

Seconded, but I've found a cheat code to get much farther with minimal intervention.

Step 1: tell them your goal and have them generate a doc that includes design principles, system invariants, and acceptance criteria.

No amount of CLAUDE.md or skills beats re-iterating the focus points directly in the prompt.

Step 2: tell them to summarize the doc (pay close attention here). Have them save it somewhere (I use docs/agents) once you're happy with it.

Step 3: tell them to build a detailed plan to meet the objectives of the doc.

Step 4: let them go wild.

Step 5: once they declare "done", feed their progress to another LLM (Gemini is quite decent for review, and free) -> mindlessly feed the feedback back to the implementing LLM.

Step 6: Say the magic words: https://github.com/cuzzo/clear/blob/master/docs/retrospectiv...

Again, I've found no amount of skills or CLAUDE.md beats slightly modifying a prompt to meet your exact goals specific to the design and what you know of the implementation so far.

Step 7: Have them rebuild a plan to address feedback.

Step 8: Let them go wild. Loop back to Step 5 until the LLMs tell you there are no major action items.

Step 9: Tell them to remove anything from the commit that's not strictly necessary, get rid of comment changes that aren't strictly necessary, etc.

Step 10: here and only here do you invest your time (worth 100x what you're paying them) to look at what they did. Here you can give them feedback to address anything you saw.

Step 11: Review.

Step 12: Profit $$$

I got a quite decent implementation of a Finite State Machine and a Thunk + Trampoline transformation of code in a custom language I'm building in about 1 day, barely checking in while commuting to and from work on the train...

Occasionally, at step 11, you will find a gigantic turd and wonder how the LLMs converged on this. But, typically, it's at least good enough at that stage.

I don't even waste my time looking at anything they've done until they've converged on a good design and implementation with no holes, no feedback, no notes that does what a minimal, summarized doc clearly states and follows the design principles. Because they DEFINITELY haven't in a one-shot.

No, there are millions upon millions of mediocre lines of code out there.

And LLMs tend to converge on mediocrity. Which is totally fine.

  • I find that there’s a surprising amount of really good stuff out there.

    PHP has come of age. Actually, it’s been a backbone technology for millions of professional sites and apps for many years, and people tend to work in the open. Sort of the nature of the language.

    There’s a popular perception that PHP programmers are bad programmers, but that’s a dated point of view. Pros have been using it to make serious money, and create serious infrastructure, for many years.

    • I am not dissing PHP, I am saying that the absolute majority of any code out there is mediocre and not super good and not super bad.
