
Comment by kiitos

2 days ago

Right, so -- 'you think that you're "deciding what gets built and how it's designed" by iterating on the prompts that you feed to the LLM that generates the code'

> My prompts specify very precisely what should be implemented.

And the precision of your prompt's specifications has no reliable impact on exactly what code the LLM returns as output.

> With the details I provided, combined with the OAuth spec, there was really very little room left for any creativity in the code. It was basically connect-the-dots at that point.

I truly don't know how you can come to this conclusion, if you have any amount of observed experience with any of the current-gen LLM tools. No amount of prompt engineering gets you a reliable mapping from input query to output code.

> I designed the end-to-end encryption scheme and told it in detail how to implement it. I pointed out bugs and explained how to fix them. And so on.

I guess my response here is that if you think this approach to prompt engineering gets you generated code that is in any sense equivalent, or even comparable, in quality to the work you could produce yourself as a professional, senior-level software engineer, then, man, we're on different planets. Pointing out bugs and explaining how to fix them in your prompts in no way gets you deterministic, reliable, accurate, high-quality code as output. And actually, forget about high-quality; I mean even just bare-minimum, table-stakes, requirements-satisfying stuff!

Nobody has claimed to be getting deterministic outputs from LLMs.

  • > My prompts specify very precisely what should be implemented. I specified the public API and high-level design upfront. I let the AI come up with its own storage schema initially but then I prompted it very specifically through several improvements (e.g. "denormalize this table into this other table to eliminate a lookup"). I designed the end-to-end encryption scheme and told it in detail how to implement it. I pointed out bugs and explained how to fix them. And so on.

    OK. Replace "[expected] deterministic output" with whatever term best fits what this block of text is describing, as that's what I'm talking about. The claim is that a sufficiently precisely specified prompt can produce reliably correct code. Which is just clearly not the case, as of today.

    • I don't even think anybody expects reliably correct code. They expect code that can be made as reliable as the code they themselves could write, with some minimal amount of effort. Which clearly is the case.
