Comment by smarnach
17 hours ago
> we should default to the calculation of 2-4x the rate.
No, we should not. We should accept that we don't have any statistically meaningful number at all, since we only have a single incident.
Let's assume we roll a standard die once and it shows a six. Statistically, we only expect a six in one sixth of the cases. But we already got one on a single roll! Concluding Waymo vehicles hit 2 to 4 times as many children as human drivers is like concluding the die in the example is six times as likely to show a six as a fair die.
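For what it's worth, here is the arithmetic of the die example spelled out as a tiny sketch -- nothing in it is about the actual Waymo data, it just makes the "six times as likely" point explicit:

```python
# The die example in numbers: the naive frequency estimate after a single
# roll is wildly off even for a perfectly fair die.
from fractions import Fraction

p_six_fair = Fraction(1, 6)          # chance a fair die shows a six on one roll
print(float(p_six_fair))             # ~0.167 -- one six in one roll is unsurprising

naive_estimate = Fraction(1, 1)      # "observed frequency" of sixes: 1 six in 1 roll
print(naive_estimate / p_six_fair)   # 6 -- the naive estimate is six times the true rate
```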
More data would certainly be better, but it's not as bad as you suggest -- the large number of miles driven until the first incident does tell us something statistically meaningful about the incident rate per mile driven. If we view the data as a large sample of miles driven, each with some observed number of incidents, then what we have is "merely" an extremely skewed distribution. I can confidently say that if you pick any sane family of distributions to model this, then after fitting just this "single" data point, the model will report that P(MTTF < one hundredth of the miles driven so far) is negligible. This would hold even if there were zero incidents so far.
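To make that concrete, here's a rough sketch under a Poisson model; the mileage figure is made up purely for illustration, only the ratio matters:

```python
# If the true MTTF were only 1/100 of the miles driven so far, we'd expect
# roughly 100 incidents by now.  How likely is it to have seen at most one?
from scipy.stats import poisson

miles_driven = 1_000_000                    # hypothetical exposure so far
mttf_hypothesis = miles_driven / 100        # MTTF = 1/100 of the miles driven
expected_incidents = miles_driven / mttf_hypothesis   # = 100

print(poisson.cdf(1, expected_incidents))   # ~3.8e-42, one incident observed
print(poisson.cdf(0, expected_incidents))   # ~3.7e-44, the zero-incident case
# Either way the probability is negligible, so any reasonable fit assigns
# essentially no mass to MTTF < miles_driven / 100.
```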
We get a statistically meaningful upper bound on the incident rate, but no statistically meaningful lower bound.
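One way to see that asymmetry is the textbook exact (chi-square based) confidence interval for a Poisson mean -- this is a standard formula, not something from the thread, and the single observed incident is the only input:

```python
# Exact two-sided 95% CI for the expected number of incidents over the
# exposure we already have, given that exactly one incident was observed.
from scipy.stats import chi2

k = 1            # incidents observed
alpha = 0.05     # 95% confidence

lower = chi2.ppf(alpha / 2, 2 * k) / 2            # ~0.025 expected incidents
upper = chi2.ppf(1 - alpha / 2, 2 * (k + 1)) / 2  # ~5.57 expected incidents
print(lower, upper)
# Upper bound: the true rate is very unlikely to exceed ~5.6x what was
# observed -- a meaningful statement.  Lower bound: ~0.025, practically
# indistinguishable from "almost never" -- no meaningful lower bound.
```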
Uh, the miles driven are like rolling the die, not hitting kids.
Sure, but we shouldn't stretch the analogy too far. Die rolls are discrete events, while miles driven are continuous. We expect the number of sixes to follow a binomial distribution and the number of accidents to follow a Poisson distribution. Either way, trying to estimate the mean of the distribution from a single occurrence of the event will never give you a statistically meaningful lower bound, only an upper bound.
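As a quick check that the binomial/Poisson distinction really doesn't change anything here, a sketch with made-up numbers (one expected incident over n hypothetical miles):

```python
# Treat each mile as one Bernoulli trial (binomial) vs. a continuous
# Poisson model with the same expected count.  Numbers are illustrative.
from scipy.stats import binom, poisson

n = 1_000_000      # hypothetical miles, each treated as one trial
p = 1 / n          # hypothetical per-mile incident probability

for k in range(4):
    print(k, binom.pmf(k, n, p), poisson.pmf(k, n * p))
# The two columns agree to several decimal places, so the choice of model
# doesn't change the conclusion about the bounds.
```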
The binomial and Poisson distributions are practically indistinguishable when n is high and p is low, which is exactly the case here. Despite the high variance in the sample mean, we can still make high-confidence statements about which incident rates are plausible -- basically, dramatically higher rates are extremely unlikely. (I'm not sure, but I suspect that confidence in statements about the true incident rate being lower than observed will turn out to be much lower.)
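To put rough numbers on that, here is a sketch of how probable "at most one incident so far" is under hypothetical true rates expressed as multiples of the observed one; the exposure itself cancels out, so only the multiples matter:

```python
# Probability of having seen at most one incident over the exposure so far,
# for several hypothetical true rates.  Illustrative only.
from scipy.stats import poisson

observed_mean = 1.0   # one incident over the exposure so far

for multiple in (0.1, 0.5, 1, 2, 4, 10, 100):
    prob = poisson.cdf(1, multiple * observed_mean)
    print(f"{multiple:>5}x observed rate -> P(<= 1 incident) = {prob:.3g}")
# Rates 10x or 100x the observed one make the data wildly improbable, so
# they can be ruled out with high confidence.  Rates well below the observed
# one (0.1x, 0.5x) remain entirely plausible, which is why statements that
# the true rate is lower than observed carry much less confidence.
```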