Comment by jihadjihad

8 hours ago

In my experience, asking the model to construct an automated test suite with no additional context is asking for a bad time. You'll get tests for a custom exception class that you (or the LLM) wrote that check the message argument can be overridden by the caller, or that a class responds to a certain method, or some other pointless and/or tautological test.
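
For concreteness, here's a minimal Python sketch of the kind of tautological tests being described (the `PaymentError` class and test names are hypothetical, made up for illustration):

```python
class PaymentError(Exception):
    """Hypothetical custom exception, the kind an LLM might generate."""
    def __init__(self, message="payment failed"):
        super().__init__(message)

def test_message_can_be_overridden():
    # Tautological: this only re-verifies Python's built-in Exception
    # behavior, not any logic the project actually wrote.
    err = PaymentError("card declined")
    assert str(err) == "card declined"

def test_has_init_method():
    # Tautological: asserts a method exists without exercising it.
    assert hasattr(PaymentError, "__init__")
```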

If you start with an example file of tests that follow a pattern you like, along with the code the tests are for, it's pretty good at following along. Even adding a sentence to the prompt about avoiding tautological tests and focusing on the seams of functions/objects/whatever (integration tests) can get you most of the way to a solid test suite.
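
A hedged sketch of what "testing the seams" might look like in practice (all names here are hypothetical, just to contrast with the tautological example above):

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    price_cents: int

class FakeGateway:
    """Test double standing in for a real payment gateway."""
    def __init__(self):
        self.charges = []

    def charge(self, amount_cents):
        self.charges.append(amount_cents)
        return True

@dataclass
class Order:
    status: str

class OrderService:
    def __init__(self, gateway):
        self.gateway = gateway

    def checkout(self, cart):
        total = sum(item.price_cents for item in cart)
        paid = self.gateway.charge(total)
        return Order(status="paid" if paid else "failed")

def test_checkout_charges_gateway_and_marks_order_paid():
    # Exercises the seam between OrderService and its gateway:
    # real logic (totaling, status) runs; only the boundary is faked.
    gateway = FakeGateway()
    service = OrderService(gateway=gateway)
    order = service.checkout(cart=[Item("book", 1200), Item("pen", 300)])
    assert gateway.charges == [1500]
    assert order.status == "paid"
```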

One agent writes the tests and threads the needle.

Another agent reviews the tests: it finds duplicate code and poor testing patterns, looks for tests that only follow the "happy path", and makes sure actual logic is tested so you're not wasting time on things like getters and setters. That agent writes up a report.

Give that report back to the agent that wrote the tests, or spin up a new agent and feed the report to it.

Don't do all of this blindly; actually read the report to make sure the LLM is on the right path. Repeat that one or two times.
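
A rough sketch of that write/review loop, for anyone who wants to script it. `run_agent` is a hypothetical stand-in for whatever agent runner you use (Claude Code, an API wrapper, etc.), not a real API:

```python
def run_agent(role_prompt: str, input_text: str) -> str:
    # Hypothetical: wire this to your actual agent runner.
    raise NotImplementedError

WRITER_PROMPT = (
    "Write tests for this code, following the style of the example file. "
    "Avoid tautological tests; focus on the seams between components."
)
REVIEWER_PROMPT = (
    "Review these tests. Flag duplicate code, happy-path-only coverage, "
    "and tests of trivial getters/setters. Write a short report."
)

def write_and_review(code: str, rounds: int = 2) -> str:
    tests = run_agent(WRITER_PROMPT, code)
    for _ in range(rounds):
        report = run_agent(REVIEWER_PROMPT, tests)
        # Read the report yourself before feeding it back; don't loop blindly.
        print(report)
        tests = run_agent(
            WRITER_PROMPT + "\nFix the issues in this report:\n" + report,
            code + "\n\n" + tests,
        )
    return tests
```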

Yeah, I've seen this too. It bangs out a five-hundred-line unit test file, but half of the tests are as you describe.

Just writing one line in CLAUDE.md or similar saying "don't test library code; assume it is covered" works.

Half the battle with this stuff is realizing that these agents are VERY literal. The other half is paring down your spec/token usage without sacrificing clarity.