Comment by agentultra
14 days ago
This is not a good idea.
If you want better tests with more cases exercising your code: write property-based tests.
Tests form an executable, informal specification of what your software is supposed to do. They should absolutely be written by hand, by a human, for other humans to use and understand. Natural language is not precise enough for even informal specifications of software modules, let alone software systems.
If using LLMs to help you write the code is your jam, I can't stop you, but at least write the tests. They're more important.
As an aside, I understand how this antipathy towards TDD develops. People write unit tests after writing the implementation, because they see them as boilerplate code that mirrors what the code they're testing already does. They're missing the point of what makes a good test useful and sufficient. I would not expect generating more tests of this nature to improve software much.
Edit: added some wording for clarity.
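For readers who have not seen one, here is a minimal sketch of what a property-based test looks like. It is hand-rolled with Python's standard library purely for illustration; the helper name `check_sort_properties` and the choice of sorting as the subject are assumptions, not anything from the article. Dedicated libraries such as Hypothesis or QuickCheck add input generation strategies and shrinking of failing cases, which this sketch omits.

```python
import random
from collections import Counter


def check_sort_properties(sort_fn, trials=200, seed=0):
    """Check two properties of sort_fn on randomly generated integer lists.

    Hypothetical helper for illustration. Returns the number of trials run,
    or raises AssertionError carrying the failing input.
    """
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = sort_fn(list(xs))
        # Property 1: the output is in non-decreasing order.
        assert all(a <= b for a, b in zip(out, out[1:])), xs
        # Property 2: the output is a permutation of the input.
        assert Counter(out) == Counter(xs), xs
    return trials


# The built-in sorted() should satisfy both properties on every generated input.
check_sort_properties(sorted)
```

The two assertions state what "sorted" means rather than listing input/output pairs, which is the sense in which the test reads as a specification.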
The confusion in this article about what TDD is demonstrates how far everything has drifted. It's interesting in terms of what it achieves, but I don't think it's useful as a comment on TDD (or, for that matter, testing).
I got massive productivity gains from having an LLM fill out my test suite.
It is like autocomplete and macros... "Based on these two unit tests, fill out the suite considering b, c, and d. Add any critical corner case tests I have missed or suggest them if they don't fit well."
It is on the human to look at the generated tests and ensure they a) are comprehensive, b) are useful, and c) communicate clearly.
Can you expand on that? What was the domain, and how did you start? I would like to give this a try but am not quite sure I get it.
Backend coding for web services.
In the past I would hand-write 8 or 9 unit tests. Now I write the first one or two and then brain dump anything else into the LLM prompt. It then outputs mine plus 6 or more.
I delete any that seem low value or ridiculous, or use a follow-up prompt to ask for refinements. Then just copy/pasta back into the codebase out of the chat.
See, I’m arguing for writing fewer, better tests.
I realize that it’s the norm to rely heavily on unit tests. Hundreds or thousands of examples of inputs and outputs. We still find errors in programs. “Examples prove the presence of an error, not the absence of errors,” as Dijkstra (or was it Hoare? I can’t remember) would say. So I understand how one could view having an LLM generate tests being a win for productivity in that case.
But such test suites don’t add much. And generating 20 more tests won’t tell me much more about the code. It will actually make the test suite harder to read and understand.
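To make that concrete, here is a small sketch under stated assumptions: `buggy_max` is a hypothetical function with a deliberately planted bug, and `find_counterexample` is an illustrative helper that checks one property over every small input instead of a handful of chosen ones. The hand-picked examples all pass; the single property does not.

```python
from itertools import product


def buggy_max(xs):
    """Hypothetical implementation with a deliberate bug: starts from 0,
    so it silently assumes at least one non-negative value."""
    best = 0
    for x in xs:
        if x > best:
            best = x
    return best


# Hand-picked examples, the kind a generated suite tends to multiply. All pass.
assert buggy_max([1, 2, 3]) == 3
assert buggy_max([7]) == 7
assert buggy_max([4, 9, 2]) == 9


def find_counterexample(fn, values=range(-2, 3), max_len=3):
    """Return the first small non-empty list for which fn violates the
    property 'the result is an element of the list and no element exceeds it',
    or None if every enumerated list satisfies it."""
    for n in range(1, max_len + 1):
        for xs in product(values, repeat=n):
            result = fn(list(xs))
            if result not in xs or any(result < x for x in xs):
                return list(xs)
    return None


counterexample = find_counterexample(buggy_max)
print(counterexample)
```

Adding twenty more positive-number examples would not have exposed the bug; one property stated over all small inputs does, and it stays a single short test to read.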