Comment by tokenscoper
41 minutes ago
> but the engineers did not add a sufficiently thorough e2e test case to test the tool call against unrelated email verification attempts being provided to the tool call.
I'd go out on a limb to say the tests were likely AI generated. It's easy to miss a case like this one given that models like to generate a ton of test code that 'look' good at a glance but have subtle logic bugs that could potentially defeat the purpose of the test itself.
My own anecdata here, Claude generated a JUnit test with all the right setup, but missed a crucial assertion (there were very many other minor assertions) which made the test useless mostly.
No comments yet
Contribute on Hacker News ↗