Comment by IanCal

13 hours ago

I definitely had the same issue when I started; I think you've hit on the two main mental blocks many people run into.

> this means you need a second function that acts the same as the function you wrote.

Actually no: you need to check that the outcomes satisfy certain properties. The example there works for a reimplementation (for all inputs, the output of the existing function and the new one must be identical), but you can break a sorting function down into other properties.

Here are some for a sorting function that sorts list X into list Y.

Lists X and Y should be the same length.

All of the elements in list X should be in list Y.

Y[n] <= Y[n+m] for m >= 0 and n+m < len(Y)

If you support a reverse parameter, iterating backwards over the sorted list should match the reverse sorted list.
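
Roughly, with Hypothesis those might look like the sketch below. my_sort is just a stand-in for whatever sort you're testing (here it delegates to Python's built-in sorted):

    from collections import Counter

    from hypothesis import given, strategies as st

    def my_sort(xs, reverse=False):
        # Stand-in for the sort function under test.
        return sorted(xs, reverse=reverse)

    @given(st.lists(st.integers()))
    def test_same_length(xs):
        assert len(my_sort(xs)) == len(xs)

    @given(st.lists(st.integers()))
    def test_same_elements(xs):
        # Compare as multisets so duplicates are accounted for.
        assert Counter(my_sort(xs)) == Counter(xs)

    @given(st.lists(st.integers()))
    def test_is_ordered(xs):
        ys = my_sort(xs)
        assert all(ys[i] <= ys[i + 1] for i in range(len(ys) - 1))

    @given(st.lists(st.integers()))
    def test_reverse_parameter(xs):
        assert list(reversed(my_sort(xs))) == my_sort(xs, reverse=True)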

Sorting functions are a good, simple case, but we don't write them often, so it can be hard to take that idea and apply it elsewhere.

Here's another example:

In the app, focus ELEMENT A. Press tab X times. Press shift-tab X times. Focus should be on ELEMENT A.

In an app with two or more elements, focus ELEMENT A. Press tab. The focus should have changed.

In an app with N elements, focus ELEMENT A. Press tab N times. The focus should be on ELEMENT A.

Those are simple properties that should hold and are easy to test. Maybe you'd also have some specific set of tests for certain cases and numbers, but with only those there's a bigger chance you miss that focus gets trapped in some part of your UI.
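
Here's a sketch of those three, using a toy focus model (an index that wraps around N elements) in place of a real app; ToyApp and its methods are made up for illustration:

    from hypothesis import given, strategies as st

    class ToyApp:
        # Toy stand-in for a real app: focus is an index into N elements.
        def __init__(self, n_elements):
            self.n = n_elements
            self.focused = 0

        def tab(self):
            self.focused = (self.focused + 1) % self.n

        def shift_tab(self):
            self.focused = (self.focused - 1) % self.n

    @given(st.integers(min_value=1, max_value=50), st.integers(min_value=0, max_value=100))
    def test_tab_then_shift_tab_returns_focus(n, x):
        app = ToyApp(n)
        start = app.focused
        for _ in range(x):
            app.tab()
        for _ in range(x):
            app.shift_tab()
        assert app.focused == start

    @given(st.integers(min_value=2, max_value=50))
    def test_tab_moves_focus(n):
        app = ToyApp(n)
        start = app.focused
        app.tab()
        assert app.focused != start

    @given(st.integers(min_value=1, max_value=50))
    def test_tab_n_times_wraps_around(n):
        app = ToyApp(n)
        start = app.focused
        for _ in range(n):
            app.tab()
        assert app.focused == start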

I had a library for making UIs for TVs that could be shaped in any way (so a carousel with 2 vertical elements, then one larger one, then two again, for example). I had a test that, for a UI, if you press a direction and the focus changes, then pressing the opposite direction should take you back to where you were. Extending that, I had the same test run but combined with "for any sequence of UI construction API calls...". This single test found a lot of bugs in edge cases, but it was a very simple statement about using the navigation.

I had a similar test - no matter what API calls are made to change the UI, either there are no elements on screen, or one and only one has focus. This was defined in the spec, but we had also defined in the spec that if there was focus, removing everything and then adding in items meant nothing should have focus. These parts of the spec were independently tested with unit tests, and nobody spotted the contradiction until we added a single test that checked the first part automatically.
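
Hypothesis has stateful testing for exactly this kind of "no matter what API calls are made..." check: it generates whole sequences of calls and re-checks an invariant after every step. A rough sketch, with a toy UI standing in for the real library (the real version would drive your actual construction and navigation API, and could include the back-and-forth navigation rule too):

    from hypothesis import strategies as st
    from hypothesis.stateful import RuleBasedStateMachine, invariant, rule

    class ToyUI:
        # Toy stand-in for the real UI library.
        def __init__(self):
            self.elements = []
            self.focused = None  # index into self.elements, or None

        def add(self, name):
            self.elements.append(name)
            if self.focused is None:
                self.focused = 0

        def remove_all(self):
            self.elements = []
            self.focused = None

        def focus_count(self):
            return 0 if self.focused is None else 1

    class UIMachine(RuleBasedStateMachine):
        def __init__(self):
            super().__init__()
            self.ui = ToyUI()

        @rule(name=st.text())
        def add_element(self, name):
            self.ui.add(name)

        @rule()
        def clear(self):
            self.ui.remove_all()

        @invariant()
        def zero_or_one_focused(self):
            # Either nothing is on screen, or exactly one element has focus.
            if self.ui.elements:
                assert self.ui.focus_count() == 1
            else:
                assert self.ui.focus_count() == 0

    TestUIMachine = UIMachine.TestCase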

This I find exceptionally powerful when you have APIs. For any way people might use them, some basic things hold true.

I've had cases where we were identifying parts of text, with properties like "given a random string, adding ' thethingtofind ' results in getting at least a match for ' thethingtofind ', with start and end points, and when we extract that part of the string we get ' thethingtofind '". Sounds basic, right? Almost pointless to add when you've seen the output work just fine and you have explicit test cases for this? It failed. At one step we lowercased things for normalisation and took the positions from that lowercased string. If there was a Turkish İ in the string, lowercasing turned it into two characters:

    >>> len("İ") == len("İ".lower())
    False

So anything where this appeared before the match string would throw off the positions.
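
As a sketch, the property is something like the below. find_positions is a deliberately buggy stand-in that takes its positions from the lowercased string, so running this will reproduce exactly that kind of failure (Hypothesis will eventually generate a string containing İ):

    from hypothesis import given, strategies as st

    NEEDLE = " thethingtofind "

    def find_positions(text, needle):
        # Buggy stand-in: positions come from the lowercased string, whose
        # length can differ from the original (e.g. "İ".lower() is two chars).
        lowered = text.lower()
        start = lowered.find(needle.lower())
        return (start, start + len(needle)) if start != -1 else None

    @given(st.text(), st.text())
    def test_inserted_needle_is_found(prefix, suffix):
        text = prefix + NEEDLE + suffix
        span = find_positions(text, NEEDLE)
        assert span is not None
        start, end = span
        assert text[start:end] == NEEDLE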

Have a look at your tests; I imagine you have some like "Doing X results in Y" and "Doing X twice results in Y happening once". Those are two tests expressing "Doing X n times, where n >= 1, Y happens once". That version also describes the actual property you want to test much more explicitly.

You probably have some tests where you parameterise a set of inputs but actually check the same thing about the output - those could be randomly generated inputs covering a much wider variety of cases.
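
For example (toy password rule, purely for illustration):

    import pytest
    from hypothesis import given, strategies as st

    def is_valid_password(pw):
        # Toy rule standing in for the real validation.
        return len(pw) >= 12

    # Hand-picked short passwords, all expecting the same outcome:
    @pytest.mark.parametrize("pw", ["", "a", "hunter2", "password1"])
    def test_short_passwords_rejected_examples(pw):
        assert not is_valid_password(pw)

    # The same check over any string shorter than 12 characters:
    @given(st.text(max_size=11))
    def test_short_passwords_rejected(pw):
        assert not is_valid_password(pw)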

I'd also suggest that if you have a library or application and you cannot give a general statement that always holds true, it's probably quite hard to use.

Worst case, point Hypothesis at an HTTP API and have it auto-generate some tests that say "no matter what I send to this, it doesn't return a 500".
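
Something like this - the URL and payload shape are made up, so point it at whatever your service actually exposes:

    import requests
    from hypothesis import given, settings, strategies as st

    @settings(deadline=None, max_examples=50)
    @given(st.text())
    def test_never_returns_500(payload):
        # Hypothetical endpoint and parameter; substitute your own.
        resp = requests.post("http://localhost:8000/search", json={"query": payload})
        assert resp.status_code != 500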

In summary (and no, this isn't LLM generated):

1. Don't test every case.

2. Don't check the exact output, check something about the output.

3. Ask where you have general things that are true. How would you explain something to a user without a very specific setup? (You can't do X more than once, you always get back Y, it's never more than you put in Z...)

4. See if you have numbers in tests where the specific number isn't relevant.

5. See if you have inputs in your tests where the input is a developer putting random things in (sets of short passwords, whatever).