← Back to context

Comment by chickensong

1 month ago

They're real at scale. Plenty of bugs don't suface until you're running under heavy load on distributed infrastructure. Often the culprit is low in the stack. Asking the reporter the right questions may not help in this case. You have full traces, but can't reproduce in a test environment.

When the cause is difficult to source or fix, it's sometimes easier to address the effect by coding around the problem, which is why mature code tends to have some unintuitive warts to handle edge cases.