Comment by CoastalCoder

21 hours ago

> Always include some randomness in test values.

If this isn't a joke, I'd be very interested in the reasoning behind that statement, and whether or not there are some qualifications on when it applies.

j1mr10rd4n 12 hours ago

There's another good reason that hasn't been detailed in the comments so far: expressing intent.

A test should communicate its reason for testing the subject, and when an input is generated or random, it clearly communicates that this test doesn't care about the specific _value_ of that input, it's focussed on something else.

This has other beneficial effects on test suites, especially as they change over the lifetime of their subjects:

* keeping test data isolated, avoiding coupling across tests
* avoiding magic strings
* and as mentioned in this thread, any "flakiness" is probably a signal of an edge-case that should be handled deterministically and
* it's more fun [1]

[1] https://arxiv.org/pdf/2312.01680
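
A minimal sketch of the "expressing intent" point, using only the Python standard library. `any_string` and `Registry` are hypothetical names invented for this illustration; they do not come from the thread or from any particular library. The generated name signals that the test is about duplicate handling, not about the name itself.

```python
import uuid


def any_string(prefix="value"):
    # Hypothetical helper: returns an arbitrary, unique string so the reader
    # can see that the test does not depend on the specific value.
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


class Registry:
    # Hypothetical subject under test: refuses to register a name twice.
    def __init__(self):
        self._names = set()

    def add(self, name):
        if name in self._names:
            raise ValueError(f"duplicate name: {name}")
        self._names.add(name)


def test_rejects_duplicates():
    name = any_string("user")  # the value is irrelevant, only its reuse matters
    registry = Registry()
    registry.add(name)
    try:
        registry.add(name)
    except ValueError:
        return True
    return False
```

Because every call produces a fresh value, two tests using `any_string` cannot accidentally share data, which is the isolation benefit listed above.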

dathinab 21 hours ago

humans are very good at overlooking edge cases, off by one errors etc.

so if you generate test data randomly you have a higher chance of "accidentally" running into overlooked edge cases

you could say there is an "adding more random -> cost" ladder like

- no randomness, no cost, nothing gained

- a bit of randomness, very small cost, very rarely beneficial (<- doable in unit tests)

- (limited) prop testing, high cost (test runs multiple times with many random values), decent chance to find incorrect edge cases (<- can be barely doable in unit tests, if limited enough, often feature-gated as too expensive)

- (full) prop testing/fuzzing, very very high cost, very high chance incorrect edge cases are found IFF the domain isn't too large (<- a full test run might need days to complete)

ssdspoimdsjvv 20 hours ago
I've learnt that if a test only fails sometimes, it can take a long time for somebody to actually investigate the cause; in the meantime it's written off as just another flaky test. If there really is a bug, it will probably surface sooner in production than it gets fixed.
- tomjakubowski 16 hours ago
  
  Flaky tests are a very strong signal of a bug, somewhere. Problem is it's not always easy to tell if the bug's in the test or in the code under test. The developer who would rather re-run the test to make it pass than investigate probably thinks it's the test which is buggy.
- fc417fc802 8 hours ago
  
  > it's written off as just another flaky test
  So don't do that. That's bad practice. The test has failed for a reason and that needs to be handled.
- dathinab 20 hours ago
  
  sadly yes
  people often take flaky tests way less seriously than they should
  I had multiple bigger production issues which had been caught by tests >1 month before they happened in production, but were written off as flaky tests (ironically this was also not related to any random test data but more to load/race-condition-related things, which failed when too many tests that created full separate tenants for isolation happened to run at the same time).
  And in some CI environments flaky tests are too painful, so using "actual" random data isn't viable and a fixed seed has to be used on CI (that is, if you can, because too many libs/tools/etc. do not allow that). At least for "merge approval" runs. That many CI systems suck badly the moment your project and team size isn't around the size of a toy project doesn't help either.
SkyBelow 19 hours ago
Can't one get randomness and determinism at the same time? Randomly generate the data, but do so when building the test, not when running the test. This way something that fails will consistently fail, but you also have better chances of finding the missed edge cases that humans would overlook. Seeded randomness might also be great, as it is far cleaner to generate and expand/update/redo, but still deterministic when it comes time to debug an issue.
- tomjakubowski 16 hours ago
  
  Most test frameworks I have seen that support non-determinism in some way print the random seed at the start of the run, and let you specify the seed when you run the tests yourself. It's a good practice for precisely the reasons you wrote.
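
A rough sketch of the seeded approach in plain Python, not tied to any real test framework. The `TEST_SEED` environment variable, `make_rng`, and `math_add` are assumptions made up for this example. The seed is printed on every run and can be supplied again to replay a failure.

```python
import os
import random


def math_add(x, y):
    # Hypothetical function under test, named after the example in this thread.
    return x + y


def make_rng(env_var="TEST_SEED"):
    # Reuse the seed from the environment if one is given, otherwise pick a
    # fresh one. Either way, print it so a failing run can be reproduced.
    raw = os.environ.get(env_var)
    seed = int(raw) if raw is not None else random.SystemRandom().randrange(2**32)
    print(f"random seed: {seed} (set {env_var}={seed} to reproduce)")
    return seed, random.Random(seed)


def test_add_is_commutative():
    seed, rng = make_rng()
    for _ in range(100):
        a, b = rng.randint(-1000, 1000), rng.randint(-1000, 1000)
        # Include the seed and inputs in the message for easier debugging.
        assert math_add(a, b) == math_add(b, a), f"seed={seed} a={a} b={b}"


test_add_is_commutative()
```

On CI one could pin `TEST_SEED` for merge-approval runs and leave it unset for nightly runs, which matches the fixed-seed compromise described earlier in the thread.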

whynotmaybe 21 hours ago

Must be some Mandela effect about some TDD documentation I read a long time ago.

If you test math_add(1,2) and it returns 3, you don't know if the code does `return 3` or `return x+y`.

It seems I might need to revise my view.
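
The `return 3` concern can be sketched like this. Both implementations and the helper names are hypothetical, written only to illustrate the point; the randomized check uses `a + b` as its oracle, which is acceptable for a toy example but would be circular for real code.

```python
import random


def math_add_hardcoded(x, y):
    # Deliberately wrong: it only satisfies the single example math_add(1, 2).
    return 3


def math_add(x, y):
    return x + y


def passes_fixed(add):
    # The single fixed example from the comment above.
    return add(1, 2) == 3


def passes_randomized(add, runs=20, seed=1234):
    # Seeded so that a failure is reproducible.
    rng = random.Random(seed)
    pairs = [(rng.randint(-100, 100), rng.randint(-100, 100)) for _ in range(runs)]
    return all(add(a, b) == a + b for a, b in pairs)
```

Here `passes_fixed` cannot tell the two implementations apart, while `passes_randomized` rejects the hardcoded one as soon as it draws a pair that does not sum to 3.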

Izkata 21 hours ago
I vaguely remember the same advice, it's pretty old. How you use the randomness is test specific, for example in math_add() it'd be something like:
    jitter = random(5)
    assertEqual(3 + jitter, math_add(1, 2 + jitter))
If it was math_multiply(), then adding the jitter would fail - that would have to be multiplied in.
Nowadays I think this would be done with fuzzing/constraint tests, where you define "this relation must hold true" in a more structured way so the framework can choose random values, test more at once, and give better failure messages.
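
A hand-rolled sketch of the "this relation must hold true" idea, using only the standard library. A real property-testing framework would typically add things like input shrinking and better reporting; `check_relation` and the three relations below are hypothetical names for illustration only.

```python
import random


def math_add(x, y):
    return x + y


def math_multiply(x, y):
    return x * y


def check_relation(relation, runs=200, seed=0):
    """Return the first (x, y, jitter) that violates the relation, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        x, y, jitter = (rng.randint(-50, 50) for _ in range(3))
        if not relation(x, y, jitter):
            return (x, y, jitter)
    return None


def add_shifts_by_jitter(x, y, j):
    # Adding jitter to one argument shifts the sum by exactly jitter.
    return math_add(x, y + j) == math_add(x, y) + j


def multiply_scales_jitter(x, y, j):
    # For multiplication the jitter has to be multiplied in.
    return math_multiply(x, y + j) == math_multiply(x, y) + x * j


def multiply_shifts_by_jitter(x, y, j):
    # The additive relation applied to multiplication: expected to fail.
    return math_multiply(x, y + j) == math_multiply(x, y) + j
```

`check_relation` returns a concrete counterexample for the last relation, which is the kind of failure message a structured framework would produce automatically.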
- whynotmaybe 18 hours ago
  
  > it's pretty old.
  Damn, must be why only white hair is growing on my head now.
  > Nowadays I think this would be done with fuzzing/constraint tests, where you define "this relation must hold true" in a more structured way so the framework can choose random values, test more at once, and give better failure messages.
  So the concept of random is still there but expressed differently? (= Am I partially right?)
  
ajs1998 21 hours ago

Randomness is useful if you expect your code to do the correct thing with some probability. You test lots of different samples and if they fail more than you expect then you should review the code. You wouldn't test dynamic random samples of add(x, y) because you wouldn't expect it to always return 3, but in this case it wouldn't hurt.
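
A small sketch of testing code that is only expected to be right "with some probability". `flaky_choice` is a hypothetical function under test, and the sample size, seed, and tolerance are assumptions chosen for the example, not recommended values.

```python
import random


def flaky_choice(rng, p=0.3):
    # Hypothetical code under test: should return True with probability p.
    return rng.random() < p


def observed_rate(samples=10_000, p=0.3, seed=7):
    # Seeded so the same samples are drawn on every run.
    rng = random.Random(seed)
    hits = sum(flaky_choice(rng, p) for _ in range(samples))
    return hits / samples


def test_rate_is_close_to_expected():
    rate = observed_rate()
    # Tolerance is several standard deviations wide for 10,000 samples,
    # so a failure points at the code rather than at bad luck.
    assert abs(rate - 0.3) < 0.03, f"observed rate {rate}"


test_rate_is_close_to_expected()
```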
brewmarche 18 hours ago

This sounds like the idea behind mutation testing