Comment by shermantanktop

3 months ago

It is possible to radically increase your chances of success. You have to speak the LLM’s language, just like you write Java or Rust. But it doesn’t come with a language spec, so you get to figure it out by trial and error. And a model change means revisiting what works.

Lots of tips on how to do this are out there, but one thing I do is have it try, throw away everything it did, and try again with a completely restated question based on the good bits in what it was able to produce.

E.g., if you ask for a web app that does X and it produces a working web app that doesn’t do X, throw that away and just ask for the web app scaffolding. You’ve still come out ahead even if you take over fully.

> Lots of tips on how to do this are out there, but one thing I do is have it try, throw away everything it did, and try again with a completely restated question

This is the thing that worries me about AI/LLMs and the way people profess that they're "actually really useful when you use them right": the cliff you have to climb to figure out whether they're useful is vertical.

> You’ve still come out ahead even if you take over fully.

I just finished a weeklong saga of redoing a bunch of Claude's work because instead of figuring out how to properly codegen some files it just manually updated them and ignored a bunch of failing tests.

With another human I can ask, "Hey, wtf were you thinking when you did [x]?" and peer into their mind-state. With Claude, it's time to stir the linear algebra again. How can I tell whether I'm near a local or global maximum when all the prevailing advice is "I dunno man, just `git reset --hard origin/master` and start again, but like, with different words I guess."

We have studies showing that people feel more productive using AI while actually getting less done [1], and when "throw away everything it did and try again" based on :sparkle: vibes :sparkle: is the state of the art for how to "actually" use this stuff, I just feel more and more skeptical.

[1]: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...