Comment by ThunderSizzle

10 days ago

As in, a developer would write something in e.g. gherkin, and AI would automatically create the matching unit tests and the production code?

That would be interesting. Of course, gherkin tends to just be transpiled into generated code that is customized for the particular test, so I'm not sure how AI can really abstract it away too much.

kamaal 10 days ago

In the end, all of this reduces to a simple fact.

You need some way of precisely telling AI what to do. As it turns out, there is only so much you can do with text. Come to think of it, you can write a whole book about a scene, and yet 100 people will imagine it quite differently. And an actual photograph would still be totally different from what any of those 100 people imagined.

As it turns out, if you wish to describe something accurately enough, you have to write mathematical statements, in other words statements that reduce to true/false answers. We could skip to the end of the discussion here and say you are better off either writing code directly or writing test cases.

This is just people revisiting logic programming all over again.
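
To make "statements that reduce to true/false answers" concrete, here is a minimal sketch of a prose requirement ("return the same items, in order") pinned down as a predicate. The name `meets_spec` is purely illustrative, and the requirement itself is an assumed example, not something from the thread.

```python
from collections import Counter


def meets_spec(inputs, outputs):
    """Return True iff `outputs` is `inputs` rearranged into non-decreasing order."""
    # Statement 1: nothing was added, dropped, or duplicated.
    same_items = Counter(inputs) == Counter(outputs)
    # Statement 2: every element is <= its successor.
    ordered = all(a <= b for a, b in zip(outputs, outputs[1:]))
    return same_items and ordered
```

Any implementation, hand-written or generated, either satisfies both statements or it does not; there is no room for 100 different readings.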

motorest 10 days ago
> You need some of way of precisely telling AI what to do.
I think this is the detail you are not getting quite right. The truth of the matter is that you don't need precision to get acceptable results, at least not in 100% of the cases. As with everything in software engineering, there is indeed a "good enough".
Also worth noting, LLMs allow anyone to improve upon "good enough".
> As it turns out if you wish to describe something accurately enough, you have to write mathematical statements, in other words statements that reduce to true/false answers.
Not really. Nothing prevents you from referring to high-level sets of requirements. For example, if you tell an LLM "enforce Google's style guide", you don't have to concern yourself with how many spaces are in a tab. LLMs have been migrating towards instruction files and prompt files for a while, too.
- kamaal 10 days ago
  
  Yes, you are right. But in the sense that a human decides if AI generated code is right.
  But if you want near-100% automation, you need a precise way to specify what you want, else there is no reliable way of interpreting what you mean. And by that definition, lots of regressions and breakage have to be endured every time a release is made.

JimDabell 10 days ago

I’m talking higher level than that. Think about the acceptance criteria you would put in a user story. I’m specifically responding to this:

> When you give up the work of deciding what the expected inputs and outputs of the code/program is you are no longer in the drivers seat.

You don’t need to personally write code that mechanically iterates over every possible state to remain in the driver’s seat. You need to describe the acceptance criteria.

motorest 10 days ago
> When you give up the work of deciding what the expected inputs and outputs of the code/program is you are no longer in the drivers seat.
You're describing the happy path of BDD-style testing frameworks.
- JimDabell 10 days ago
  
  I know about BDD frameworks. I’m talking higher level than that.
  
skydhash 10 days ago
I think your perspective is heavily influenced by the imperative paradigm where you actually write the state transition. Compare that to functional programming where you only describe the relation between the initial and final state. Or logic programming where you describe the properties of the final state and where it would find the elements with those properties in the initial state.
Those do not involve writing state transitions. You are merely describing the acceptance criteria. Imperative is the norm because that's how computers work, but there are other abstractions that map more closely to how people think. Or to how the problem is already solved.
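
A rough sketch of that contrast, using an assumed toy task (squares of the even numbers in a list). Both function names are made up for illustration.

```python
def evens_squared_imperative(numbers):
    # Imperative: spell out every state transition of `result`.
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result


def evens_squared_declarative(numbers):
    # Declarative: state which elements the result contains,
    # and leave the iteration mechanics to the language.
    return [n * n for n in numbers if n % 2 == 0]
```

Both return the same list; only the second reads like a description of the final state.
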
- JimDabell 10 days ago
  
  I didn’t mention state transitions. When I said “mechanically iterate over every possible state”, I was referring to writing tests that cover every type of input and output.
  Acceptance criteria might be something like “the user can enter their email address”.
  Tests might cover what happens when the user enters an email address, what happens when the user tries to enter the empty string, what happens when the user tries to enter a non-email address, what happens when the user tries to enter more than one email address…
  In order to be in the driver’s seat, you only need to define the acceptance criteria. You don’t need to write all the tests.
  
- sitkack 10 days ago
  
  Acceptance criteria describe the thing being accepted; they describe a property of the final state.
  There is no prescriptive manner in which to deliver the solution, unless it was built into the acceptance criteria.
  You are not talking about the same thing as the parent.

motorest 10 days ago

> That would be interesting. Of course, gherkin tends to just be transpiled into generated code that is customized for the particular test, so I'm not sure how AI can really abstract it away too much.

I don't think that's how gherkin is used. Take Cucumber, for example. Cucumber only uses its feature files to specify which steps a test should execute, whereas the steps themselves are pretty vanilla JavaScript code.

In theory, nowadays all you need is a skeleton of your test project, including feature files specifying the scenarios you want to run, and then you prompt an LLM to fill in the steps required by your test scenarios.

You can also use an LLM to generate feature files, but if the goal is to specify requirements and have a test suite enforce them, implicitly the scenarios are the starting point.
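
A minimal sketch of that split between scenario text and step code, written in Python rather than JavaScript to keep it dependency-free. This is not Cucumber's actual API: the `step` decorator, the `STEPS` registry, and `run_scenario` are hypothetical stand-ins, and the validity check is deliberately naive.

```python
import re

# Hypothetical registry of (compiled pattern, function) pairs.
STEPS = []


def step(pattern):
    """Register a plain function as the implementation of one scenario step."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


# The scenario only names the steps to execute; it contains no logic.
FEATURE = """
Scenario: User enters a valid email address
  Given an empty signup form
  When the user enters "alice@example.com"
  Then the form is valid
"""


@step(r'an empty signup form')
def empty_form(ctx):
    ctx["email"] = None


@step(r'the user enters "(.*)"')
def enter_email(ctx, value):
    ctx["email"] = value


@step(r'the form is valid')
def form_is_valid(ctx):
    # Naive check for illustration only, not real email validation.
    assert ctx["email"] and "@" in ctx["email"]


def run_scenario(text):
    """Match each Given/When/Then line to a registered step and run it."""
    ctx = {}
    executed = []
    for line in text.strip().splitlines():
        keyword, _, rest = line.strip().partition(" ")
        if keyword not in {"Given", "When", "Then", "And", "But"}:
            continue  # skip the "Scenario:" header line
        for pattern, func in STEPS:
            match = pattern.fullmatch(rest)
            if match:
                func(ctx, *match.groups())
                executed.append(func.__name__)
                break
        else:
            raise LookupError(f"no step matches: {rest!r}")
    return executed
```

Calling `run_scenario(FEATURE)` runs the three step functions in order. In the workflow described above, the scenario text would be what a person writes, and the decorated functions would be the part an LLM is asked to fill in.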