Comment by dkarl
11 hours ago
> Once my property tests were running overnight and not finding any failures
Quickly addressing the time it takes to run property-based tests, especially in Python: it's extremely helpful to make sure that running subsets of unit tests is convenient and documented, and that all your developers are actually doing it. Otherwise they may revolt against property-based testing because of how long the tests take to run. Again, especially in Python.
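For Python specifically, a minimal sketch of what that can look like (the "slow" marker and test bodies are illustrative, not from anyone's actual codebase): tag the expensive property tests so the default developer run skips them.

```python
# Hypothetical example: mark long-running property tests "slow"
# (register the marker under [pytest] markers in pytest.ini) so the
# quick developer run can exclude them.
import pytest
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_concat_preserves_length(xs):
    # Cheap property: runs on every invocation.
    assert len(xs + xs) == 2 * len(xs)

@pytest.mark.slow
@given(st.lists(st.integers(), max_size=50_000))
def test_split_concat_on_large_lists(xs):
    # Expensive property: only runs when slow tests are requested.
    mid = len(xs) // 2
    assert xs[:mid] + xs[mid:] == xs
```

Then `pytest -m "not slow"` is the everyday command, and plain `pytest` runs everything.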
I may have given a misleading impression. Each property test took milliseconds to run, and FsCheck defaults to generating 100 random inputs per test. Running the whole test suite took 5-10 minutes, depending on whether I ran the longer tests or skipped them (the tests that generated very large lists, then split and concatenated them several times, took longer than the rest of the suite combined).

What ran overnight was the stress-testing feature of the Expecto test runner (https://github.com/haf/expecto?tab=readme-ov-file#stress-tes...): instead of running 100 cases for each property test you define, it keeps generating random inputs and running the tests for a fixed length of time. I would set it to run for 8 hours and go to bed. In the morning I would look at the millions (literally millions, usually 2-3 million) of random tests that had run, all passing, and say, "Yep, I probably don't have any bugs left."
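For the Python folks reading along: as far as I know, Hypothesis has no built-in wall-clock stress mode like Expecto's, but a rough approximation (a sketch, with illustrative numbers and test body) is to crank the per-test example budget and drop the per-example deadline:

```python
# Rough analogue of an overnight stress run. Hypothesis has no
# fixed-duration mode, so approximate "run for 8 hours" with a very
# large example count and no per-example deadline, then bound the
# run externally (kill it in the morning, or use a job timeout).
from hypothesis import given, settings, strategies as st

@settings(max_examples=2_000_000, deadline=None)
@given(st.lists(st.integers()))
def test_reverse_twice_is_identity(xs):
    assert xs[::-1][::-1] == xs
```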
That's pretty cool, and now I'm curious whether there's something similar for ScalaCheck. My comment comes from my own experience, though, of introducing Hypothesis and ScalaCheck into codebases and quickly causing noticeable increases in unit test times. I think the additional runtime is undoubtedly worth it, but it can feel like a bad trade-off to people who are used to running unit tests several times an hour as part of their development cycle. To head off complaints like "Running four minutes of tests five times per hour is ruining my flow and productivity," I make sure they have a script or command to run a subset of basic, less comprehensive tests, or to run only the tests relevant to the changes they've made.
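In Hypothesis, settings profiles are a convenient way to wire that up (a sketch; the profile names and example counts are assumptions, not from my actual setups):

```python
# conftest.py -- register a cheap profile for the inner dev loop and a
# thorough one for CI, selected via an environment variable.
import os
from hypothesis import settings

settings.register_profile("dev", max_examples=10)    # fast, less comprehensive
settings.register_profile("ci", max_examples=1_000)  # full example counts
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```

Developers get quick local runs by default, and CI sets `HYPOTHESIS_PROFILE=ci` to run the full example counts.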
Or a watch command that runs tests in the background on save, and an IDE setting to flag code when the watched tests produce a failure. Get used to that, and it's not even a matter of stopping to run the tests: they run every time you hit Ctrl-S, and you just keep on typing — and every so often the IDE notifies you of a failed test.
The drawback is that you might get used to saying, "Well, of course the tests failed, I'm in the middle of refactoring and haven't finished yet," and ignoring the failed-test notifications. But just as I ignore typecheck errors while I'm typing and then check whether they're still there once I've finished, you probably won't end up ignoring test failures all the time. Though a five-minute lag between "I'm finished typing, now those test errors should go away" and the errors actually clearing might be disconcerting.