Comment by simianwords

20 days ago

did you read the article?

>StrongDM’s answer was inspired by Scenario testing (Cem Kaner, 2003).

Tests are only rigorous if the correct intent is encoded in them. Perfectly working software can still be wrong if the intent was inferred incorrectly. I leverage BDD heavily, and there are a lot of little details it's possible to misinterpret going from spec -> code. If the spec were sufficient to fully specify the program, it would be the program, so there's lots of room for error in the transformation.
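To make that kind of spec -> code ambiguity concrete, here's a small invented illustration (the spec clause and both functions are hypothetical, not from any real system): a BDD clause that never says whether tax applies before or after a discount admits two faithful implementations.

```python
# Hypothetical spec clause:
#   "Given a $100 order with a $10 discount and 8% tax,
#    the customer pays the total."
# The clause never says whether tax is computed before or
# after the discount, so both readings below are "correct".

def total_discount_then_tax(subtotal: float) -> float:
    # Interpretation A: subtract the discount, then tax the remainder.
    return (subtotal - 10.00) * 1.08

def total_tax_then_discount(subtotal: float) -> float:
    # Interpretation B: tax the full subtotal, then subtract the discount.
    return subtotal * 1.08 - 10.00

print(round(total_discount_then_tax(100.00), 2))   # 97.2
print(round(total_tax_then_discount(100.00), 2))   # 98.0
```

Whichever reading the implementer (human or LLM) picks, they can write tests that pass against it; the tests only confirm the chosen interpretation is internally consistent, not that it matches the author's intent.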

  • Then I disagree with you

    > You still have to have a human who knows the system to validate that the thing that was built matches the intent of the spec.

    You don't need a human who knows the system to validate it if you trust the LLM to do the scenario testing correctly. And in my experience, it is very trustworthy in these respects.

    Can you detail a scenario in which an LLM would get the scenario testing wrong?

    • I do not trust the LLM to do it correctly. We do not have the same experience with them, and should not assume everyone does. To me, your question makes no sense to ask.


    • The whole point is that you can't 100% trust the LLM to infer your intent accurately from lossy natural language. Having it write tests doesn't change this; the tests only assert that its view of what you want is internally consistent, and that view is still just as likely to be an incorrect interpretation of your intent.


  • > If the spec was sufficient to fully specify the program, it would be the program

    Very salient concept with regard to LLMs and the idea that one can encode a desired program as natural English input. There's lots of room for error in all of these LLM transformations for the same reason.