← Back to context

Comment by callc

9 days ago

Humans writing the test first and LLM writing the code is much better than the reverse. And that is because tests are simply the “truth” and “intention” of the code as a contract.

When you give up the work of deciding what the expected inputs and outputs of the code/program is you are no longer in the drivers seat.

> When you give up the work of deciding what the expected inputs and outputs of the code/program is you are no longer in the drivers seat.

You don’t need to write tests for that, you need to write acceptance criteria.

  • > You don’t need to write tests for that, you need to write acceptance criteria.

    Sir, those are called tests.

    • I see you have little experience with Scrum...

      Acceptance criteria is a human-readable text that the person specifying the software has to write to fill-up a field in Scrum tools and not at all guide the work of the developers.

      It's usually derived from the description by an algorithm (that the person writing it has to run on their mind), and any deviation from that algorithm should make the person edit the description instead to make the deviation go away.

      2 replies →

  • As in, a developer would write something in e.g. gherkin, and AI would automatically create the matching unit tests and the production code?

    That would be interesting. Of course, gherkin tends to just be transpiled into generated code that is customized for the particular test, so I'm not sure how AI can really abstract it away too much.

    • All of this at the end reduces to a simple fact at the end of the discussion.

      You need some of way of precisely telling AI what to do. As it turns out there is only that much you can do with text. Come to think of it, you can write a whole book about a scenery, and yet 100 people will imagine it quite differently. And still that actual photograph would be totally different compared to the imagination of all those 100 people.

      As it turns out if you wish to describe something accurately enough, you have to write mathematical statements, in other words statements that reduce to true/false answers. We could skip to the end of the discussion here, and say you are better of either writing code directly or test cases.

      This is just people revisiting logic programming all over again.

      2 replies →

    • I’m talking higher level than that. Think about the acceptance criteria you would put in a user story. I’m specifically responding to this:

      > When you give up the work of deciding what the expected inputs and outputs of the code/program is you are no longer in the drivers seat.

      You don’t need to personally write code that mechanically iterates over every possible state to remain in the driver’s seat. You need to describe the acceptance criteria.

      11 replies →

    • > That would be interesting. Of course, gherkin tends to just be transpiled into generated code that is customized for the particular test, so I'm not sure how AI can really abstract it away too much.

      I don't think that's how gherkin is used. Take for example Cucumber. Cucumber only uses it's feature files to specify which steps a test should execute, whereas steps are pretty vanilla JavaScript code.

      In theory, nowadays all you need is a skeleton of your test project, including feature files specifying the scenarios you want to run, and prompt LLMs to fill in the steps required by your test scenarios.

      You can also use a LLM to generate feature files, but if the goal is to specify requirements and have a test suite enforce them, implicitly the scenarios are the starting point.

>>Humans writing the test first and LLM writing the code is much better than the reverse.

Isn't that logic programming/Prolog?

You basically write the sequence of conditions(i.e tests in our lingo) that have to be true, and the compiler(now AI) generates code for your.

Perhaps there has to be a relook on how Logic programming can be done in the modern era to make this more seamless.

Yes this is fundamental to actually designing software. Still, it would be perfectly reasonable to ask "please write a test which gives y output for x input".

I disagree. You can simply code in a way that all test passes and you have more problem than before reviewing the code that is being generated.