Comment by embedding-shape

23 days ago

> They don't have to be "good" tests, just good enough tests to get the AI writing not crap code. Think very small unit tests that you normally wouldn't think about writing yourself.

Yeah, for me those are all not "good tests"; you don't want them in your codebase if you're aiming for a long-term project. Every single test has to make sense and be needed to confirm something, and should give a clear signal when it fails. Otherwise you end up locking your entire codebase to things, because knowing which tests are actually needed becomes a mess.
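
To make "clear signal" concrete, here's roughly the bar I mean (`parse_price` and the `pricing` module are made-up names, just for illustration):

```python
# Hypothetical example: the test name plus the single assertion should
# tell you exactly what broke when it fails.
import pytest

from pricing import parse_price  # made-up module/function


def test_parse_price_rejects_negative_amounts():
    # One behavior, one reason to fail: if this goes red, negative
    # inputs stopped being rejected, nothing else.
    with pytest.raises(ValueError):
        parse_price("-3.50")
```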

Writing the tests yourself and letting the AI write the implementation leaves you with code where you know what it does, and you can confidently say what works and what doesn't. When the AI ends up writing the tests, you often don't actually know what works; you often don't learn anything useful even from scanning the test titles. How is one supposed to guarantee any sort of quality like that?

If it clarifies anything, here is my workflow (each step is a separate prompt without preserved conversation context; a rough sketch in code follows the list):

1 Create a test plan for N tests from the description. Note that this step doesn't provide specific data or logic for the tests; it just vaguely plans out N tests that don't overlap too much.

2 Create an interface from the description.

3 Create an implementation strategy from the description.

4.N Create the N tests, one at a time, from the test plan + interface (make sure each test compiles). Note that each test is created in its own prompt without conversation context.

5 Create the code using interface + implementation strategy + general knowledge, using the N tests to validate it. If test I fails and the AI decides it's the test's fault, feed that back to step 4.I.
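
A minimal sketch of how that could be wired up, assuming some `llm(prompt)` helper for whatever model API you use; all names here are hypothetical, and each call is deliberately independent (no shared conversation context):

```python
def llm(prompt: str) -> str:
    # Placeholder: plug in your actual model API call here.
    raise NotImplementedError


def run_pipeline(description: str, n: int) -> str:
    # 1. Plan N non-overlapping tests; no concrete data or logic yet.
    plan = llm(f"Create a test plan for {n} tests from:\n{description}")

    # 2. and 3. Interface and implementation strategy, each derived
    # from the description alone.
    interface = llm(f"Create an interface from:\n{description}")
    strategy = llm(f"Create an implementation strategy from:\n{description}")

    # 4.N. One fresh prompt per test, from plan + interface only.
    tests = [
        llm(
            f"Write test {i} from this plan:\n{plan}\n"
            f"against this interface:\n{interface}\n"
            "Make sure it compiles."
        )
        for i in range(1, n + 1)
    ]

    # 5. Implementation from interface + strategy, validated by the
    # tests. If test I fails and the model judges the test at fault,
    # you'd loop back to step 4.I with that feedback (not shown here).
    code = llm(
        f"Implement this interface:\n{interface}\n"
        f"following this strategy:\n{strategy}\n"
        f"so that these tests pass:\n" + "\n".join(tests)
    )
    return code
```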

If anything changes in the description, the test plan gets fixed, then the tests get fixed, and that just propagates through to the code. You don't look at the tests unless you reach a situation where the AI can't fix the code or the tests (and you really need to help out).

This isn't really your quality pass; it's a crap-filter pass (the code should work in the sense that a programmer wrote something they think works, but you can't really call it "tested" yet). Maybe you think I was claiming that this is all the testing you'll need? No, you still need real tests as well as these small ones...