Comment by chuckadams

18 hours ago

Can't disagree with anything you said, though I think there are steps to address at least some of them: for queueing systems, testing with a chaos monkey isn't a bad idea... you'd want a test environment representative of production workloads, which is hard to do, but anything should be better than nothing.

In the self-driving car scenario, you'd probably go with cold statistics: is it killing fewer people than cars that need more interventions? Just like queueing though, experiments in production could be problematic.



avidiax 12 hours ago

You can look at airport security as an example. Bombs and guns are quite rare in carry-on luggage. Watching for them would be far too boring for most operators, so they would tend to tune out of their screening job.

So what the x-ray interface does is randomly insert synthetic guns and bombs into the scan at a relatively frequent rate. The operator must click on these problem areas. If it is a synthetic object, it then disappears and the operator can continue screening the bag. If it isn't synthetic, the bag gets shunted for manual inspection.

So for a self-driving car, if it must be monitored (it's not L5 within the driving mission), then you would perhaps need the car to randomly ask the driver to take over, even though it's unnecessary. Or randomly appear to be making a mistake, to see if the user reacts.

If the user doesn't react appropriately or in time, then self-driving is disabled for a time the next time the car starts.

For the queuing system, it perhaps makes sense to inject a certain number of duplicates by default. Say 0.1%. Enough that it simply can't be ignored during development of the clients. Then, when duplicates arise as a consequence of system failures, the code is already expecting this and there's no harm to the workload.
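A minimal sketch of that idea in Python, assuming messages carry a unique ID and the consumer deduplicates on it. The names `with_injected_duplicates` and `IdempotentConsumer` are made up for illustration and don't come from any real queue library; a real system would also need to bound or persist the set of seen IDs.

```python
import random
from typing import Callable, Hashable, Iterable, Iterator, Optional, Set, Tuple

Message = Tuple[Hashable, str]  # (message_id, payload); hypothetical shape


def with_injected_duplicates(
    messages: Iterable[Message],
    rate: float = 0.001,  # 0.1%, the figure suggested above
    rng: Optional[random.Random] = None,
) -> Iterator[Message]:
    """Yield every message once, and re-yield it with probability `rate`."""
    rng = rng or random.Random()
    for msg in messages:
        yield msg
        if rng.random() < rate:
            yield msg  # deliberate duplicate delivery


class IdempotentConsumer:
    """Runs the handler at most once per message ID."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        self._seen: Set[Hashable] = set()  # unbounded here; an assumption

    def receive(self, message_id: Hashable, payload: str) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        self._handler(payload)
        return True
```

In a test environment you could crank `rate` up to 1.0 so every message arrives twice, which makes a client that isn't idempotent fail immediately rather than once in a thousand messages.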

worik 16 hours ago

> In the self-driving car scenario, you'd probably go with cold statistics

No. There is a big difference between an accident caused by human error and an accident caused by machine failure.

We tolerate much more of the former than the latter.

This feels like a cognitive failure, but I do not think it is.