Comment by wizzwizz4

2 months ago

You can guarantee that all the cases in the code are tested. That doesn't necessarily mean that all the behaviour is tested. If two implementations use very different approaches, which happen to have different behaviour on the Mersenne primes (for deep mathematical reasons), but one of them special-cases byte values using a lookup table generated from the other, you wouldn't expect mutation testing to catch the discrepancy. Each implementation is still the local optimum as far as passing tests is concerned, and the mutation test harness wouldn't know that "disable the small integer cache" is the kind of mutation that shouldn't affect whether tests pass.

There are only 8 32-bit Mersenne primes, 4 of which are byte-valued. Fuzzing might catch the bug, if it happened to hit one of the four other 32-bit Mersenne primes (which, in many fuzzers, is more likely than a uniform distribution would suggest), but I'm sure you can imagine situations where it wouldn't.
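The counts are easy to check by brute force. This is a minimal sketch; `is_prime` is a plain trial-division helper written for this illustration, not something from either implementation under discussion.

```python
def is_prime(n: int) -> bool:
    """Trial division; slow but obviously correct for 32-bit inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Mersenne numbers 2**p - 1 that fit in 32 bits, filtered down to the primes.
mersenne_primes = [2**p - 1 for p in range(2, 33) if is_prime(2**p - 1)]

# Split into the ones a byte lookup table would cover and the ones it would not.
byte_valued = [m for m in mersenne_primes if m < 256]
others = [m for m in mersenne_primes if m >= 256]

print(byte_valued)  # [3, 7, 31, 127]
print(others)       # [8191, 131071, 524287, 2147483647]
```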

4 comments


boxed 2 months ago

> but one of them special-cases byte values using a lookup table generated from the other, you wouldn't expect mutation testing to catch the discrepancy

Sure you would, if the mutation tester mutates that lookup table. That is quite easy to do, and mutmut will do it (if that lookup table is inside a function, because mutmut is based on mutant schemata).

wizzwizz4 2 months ago

If the mutation tester mutates that lookup table, then that will eventually lead to all entries in the lookup table being tested. That does not mean that the four divergent values outside the lookup table will end up being tested.
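A toy version of this, as a sketch only: `reference`, `fast`, `SMALL`, and `suite_passes` are all hypothetical names, and the bug in `fast` is a contrived stand-in for "different behaviour on the Mersenne primes". The table is passed as a parameter so a mutated copy can be swapped in by hand; this does not model how any particular mutation tester works.

```python
def reference(n: int) -> bool:
    """Trusted implementation: trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Byte lookup table generated from the reference implementation.
SMALL = [reference(n) for n in range(256)]

def fast(n: int, table=SMALL) -> bool:
    """Second implementation; assumed wrong on Mersenne primes."""
    if n < 256:
        return table[n]
    # Contrived bug: n & (n + 1) == 0 exactly when n has the form 2**k - 1,
    # so every Mersenne prime above the table is reported as composite.
    return reference(n) and (n & (n + 1)) != 0

def suite_passes(impl) -> bool:
    """A test suite covering every byte value plus a few larger inputs."""
    cases = list(range(256)) + [256, 257, 1000, 1009]
    return all(impl(n) == reference(n) for n in cases)

# Hand-made "mutant": flip one table entry.
mutant = SMALL.copy()
mutant[7] = not mutant[7]

print(suite_passes(fast))                         # True
print(suite_passes(lambda n: fast(n, mutant)))    # False: the table mutant is caught
print(fast(8191), reference(8191))                # False True: still untested
```

Catching the table mutant only requires inputs below 256, so nothing here pushes the suite towards 8191.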

JonChesterfield 2 months ago

I think if you hit full path coverage in each of them independently, run all the cases through both, and check they're consistent, you're still done.

Or branch coverage, for the lesser version. The idea is still to generate interesting cases based on each implementation, not based solely on one of them.

wizzwizz4 2 months ago

If the buggy implementation relies indirectly on the assumption that 2^n - 1 is composite, by performing a calculation that's only valid for composite values on a prime value, there won't be a separate path for the failing case. If the Mersenne numbers don't affect flow control in a special way in either implementation, there's no reason for the path coverage heuristic to produce a case that distinguishes the implementations.
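A deliberately tiny sketch of the "no separate path" situation, under loud assumptions: `bits_a` and `bits_b` are hypothetical, and they disagree on every number of the form 2^k - 1 rather than only on the prime ones. The point is just that neither function has a branch, so any single input gives full path coverage of both.

```python
def bits_a(n: int) -> int:
    """Straight-line code: one path, no branches."""
    return n.bit_length()

def bits_b(n: int) -> int:
    """Also straight-line, but off by one whenever n + 1 is a power of two."""
    return (n + 1).bit_length()

# One arbitrary input already covers the only path in each function,
# and the two implementations agree on it.
print(bits_a(10), bits_b(10))  # 4 4

# The disagreements are exactly the numbers 2**k - 1; control flow never
# singles them out, so a coverage heuristic has no reason to try them.
divergent = [n for n in range(1, 2**16) if bits_a(n) != bits_b(n)]
print(divergent == [2**k - 1 for k in range(1, 17)])  # True
```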