← Back to context

Comment by tokenscoper

41 minutes ago

> but the engineers did not add a sufficiently thorough e2e test case to test the tool call against unrelated email verification attempts being provided to the tool call.

I'd go out on a limb to say the tests were likely AI generated. It's easy to miss a case like this one given that models like to generate a ton of test code that 'look' good at a glance but have subtle logic bugs that could potentially defeat the purpose of the test itself.

My own anecdata here, Claude generated a JUnit test with all the right setup, but missed a crucial assertion (there were very many other minor assertions) which made the test useless mostly.