Comment by ibizaman

14 hours ago

I actually used property testing very successfully to test a DB driver and a migration to another DB driver in Go. I wrote about it here: https://blog.tiserbox.com/posts/2024-02-27-stateful-property...

Thanks for sharing! Your article illustrates well the benefits of this approach.

One drawback I see is that property-based tests inevitably need to be much more complex than example-based ones. This means that bugs in the tests themselves are much more likely, and that the tests are more difficult to maintain. You do mention that it's a lot of code, but I wonder if the complexity is worth it in the long run. I suppose that since testing these scenarios any other way would be even more tedious and error-prone, the answer is "yes". But it's something to keep in mind.

  • > One drawback I see is that property-based tests inevitably need to be much more complex than example-based ones.

    I don’t think that’s true; I just think the complexity is more explicit (in code) rather than implicit (in the process of coming up with examples). Example-based testing usually involves defining the conditions and properties to be tested, then constructing sets of examples that test them and that attempt to anticipate edge cases, either from the description of the requirements (black box) or from knowledge of how the code is implemented (white box).

    Property-based testing involves defining the conditions and properties, writing code that generates the conditions, and, for each property, writing a bit of code that can refute it: a check that passes if and only if the property holds for the subject under test on a particular set of inputs.

    With a library like Hypothesis which both has good generators for basic types and good abstractions for combining and adapting generators, the latter seems to be less complex overall, as well as moving the complexity into a form where it is explicit and easy to maintain/adapt, whereas adapting example-based tests to requirements changes involves either throwing out examples and starting over or revalidating and updating examples individually.
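
    As a rough sketch of what "combining and adapting generators" can look like in Hypothesis: the `Order` type and the `total_cents` function below are hypothetical stand-ins invented for illustration, not something from this thread. Only `st.integers`, `st.builds`, `st.lists`, and `@given` are Hypothesis API.

```python
from dataclasses import dataclass

from hypothesis import given, strategies as st


@dataclass(frozen=True)
class Order:
    # Hypothetical domain type, used only for illustration.
    quantity: int
    unit_price_cents: int


def total_cents(orders):
    # Hypothetical function under test.
    return sum(o.quantity * o.unit_price_cents for o in orders)


# Basic generators, constrained to the valid input range.
quantities = st.integers(min_value=1, max_value=100)
prices = st.integers(min_value=0, max_value=10_000)

# Combine the basic generators into a generator for the domain type...
orders = st.builds(Order, quantity=quantities, unit_price_cents=prices)
# ...and then into a generator for lists of that type.
order_lists = st.lists(orders, max_size=20)


@given(order_lists, order_lists)
def test_total_is_additive(xs, ys):
    # Property: the total of a concatenation is the sum of the totals.
    assert total_cents(xs + ys) == total_cents(xs) + total_cents(ys)


@given(order_lists)
def test_total_is_non_negative(xs):
    # Property: valid orders can never produce a negative total.
    assert total_cents(xs) >= 0
```

    If the requirements change (say, the maximum quantity goes up), only the `quantities` strategy needs to be touched; the properties stay as they are.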

  • My experience is that the hard part of PBT is mostly devising the generators, not writing the tests themselves.

    Since it came up in another thread (yes, it's trivial): a function `add` is no easier or harder to test with examples than with PBT. Here are some of the tests in both PBT style and example-based style:

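      import pytest
      from hypothesis import given, strategies as st

      # `add` is the function under test; `examples` is a list of (a, b) pairs.

      @given(st.integers())
      def test_left_identity_pbt(a):
        assert add(a, 0) == a

      def test_left_identity():
        assert add(10, 0) == 10

      @given(st.integers(), st.integers())
      def test_commutative_pbt(a, b):
        assert add(a, b) == add(b, a)

      @pytest.mark.parametrize("a,b", examples)
      def test_commutative(a, b):
        assert add(a, b) == add(b, a)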
    

    They're the same test, but one is more comprehensive than the other. And you can use them together: if the PBT version does find an error, you add the failing input to your example-based tests to build out your regression test suite. This is how I try to get people into PBT in the first place: just take your existing example-based tests and build a generator. If the tests start failing, that means your examples weren't sufficiently comprehensive (not surprising). Because PBT systems like Hypothesis run so many cases per test, though, you may need to either restrict the number of generated examples for performance reasons or break up complex tests into a set of smaller, faster-running tests to get the benefit.
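
    A minimal sketch of both ideas in Hypothesis, assuming a trivial `add` as the function under test: `@settings(max_examples=...)` caps the number of generated cases, and `@example(...)` pins specific inputs so they run every time. The pinned inputs here are made up for illustration; in practice they would be inputs that Hypothesis reported as failing at some point.

```python
from hypothesis import example, given, settings, strategies as st


def add(a, b):
    # Stand-in for the function under test.
    return a + b


@settings(max_examples=25)  # Hypothesis generates 100 cases by default
@given(st.integers(), st.integers())
@example(0, 0)   # hypothetical pinned regression inputs: these always run,
@example(-1, 1)  # in addition to the generated cases
def test_commutative(a, b):
    assert add(a, b) == add(b, a)
```

    Pinning with `@example` keeps the regression case next to the property it belongs to; copying the input into a plain example-based test works just as well.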

    Other things become much simpler, or at least simpler to test comprehensively, like stateful and end-to-end tests (assuming you have a way to programmatically control your system). In a real-world project, I used Hypothesis to drive an application by sending it a series of commands and queries and checking how it behaved. There are so many possible sequences that manually developing a useful set of end-to-end tests is non-trivial. With Hypothesis, however, the sequences of interactions were generated for me, and they found errors in the system. After each command (which may or may not change the application state), the invariant checks issued queries and verified the results against the model. As with example-based testing, the failing sequences can be turned into hard-coded examples in your regression test suite.
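
    To make the shape of such a test concrete, here is a small sketch using Hypothesis's `RuleBasedStateMachine`. The `KeyValueApp` class is a hypothetical in-memory stand-in for the real application (which isn't described here), and a plain dict serves as the model. Each `@rule` is a command; the `@invariant` runs after every step and queries the application to compare it against the model.

```python
from hypothesis import settings, strategies as st
from hypothesis.stateful import (
    RuleBasedStateMachine,
    invariant,
    rule,
    run_state_machine_as_test,
)


class KeyValueApp:
    """Hypothetical system under test: a tiny in-memory key-value store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def get(self, key):
        return self._data.get(key)

    def count(self):
        return len(self._data)


# A small key space makes repeated puts/deletes on the same key likely.
keys = st.text(alphabet="abc", min_size=1, max_size=2)


class KeyValueMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.app = KeyValueApp()
        self.model = {}  # reference model of the expected state

    @rule(key=keys, value=st.integers())
    def put(self, key, value):
        self.app.put(key, value)
        self.model[key] = value

    @rule(key=keys)
    def delete(self, key):
        self.app.delete(key)
        self.model.pop(key, None)

    @invariant()
    def queries_match_model(self):
        # Checked after every command: query the app, compare to the model.
        assert self.app.count() == len(self.model)
        for key, value in self.model.items():
            assert self.app.get(key) == value


# Keep runs short; both values are arbitrary choices for this sketch.
KeyValueMachine.TestCase.settings = settings(
    max_examples=50, stateful_step_count=20
)
# pytest/unittest collect this class as an ordinary test case.
TestKeyValue = KeyValueMachine.TestCase

if __name__ == "__main__":
    run_state_machine_as_test(KeyValueMachine)
```

    When a run fails, Hypothesis prints the shrunken sequence of rule calls that triggered it, which is what you would copy into a hard-coded regression test.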