Comment by bedrio
1 month ago
I'm worried about that too. If the error is reproducible, the model can eventually figure it out from experience. But a ghost bug that I can't pattern? The model ends up in a "you're absolutely right" loop as it incorrectly guesses different solutions.
Are ghost bugs even real?
My first job had the Devs working front-line support years ago. Due to that, I learnt an important lessons in bug fixing.
Always be able to re-create the bug first.
There are no such thing as ghost bugs, you just need to ask the reporter the right questions.
Unless your code is multi-threaded, to which I say, good luck!
They're real at scale. Plenty of bugs don't suface until you're running under heavy load on distributed infrastructure. Often the culprit is low in the stack. Asking the reporter the right questions may not help in this case. You have full traces, but can't reproduce in a test environment.
When the cause is difficult to source or fix, it's sometimes easier to address the effect by coding around the problem, which is why mature code tends to have some unintuitive warts to handle edge cases.
> Unless your code is multi-threaded, to which I say, good luck!
What isn't multi-threaded these days? Kinda hard to serve HTTP without concurrency, and practically every new business needs to be on the web (or to serve multiple mobile clients; same deal).
All you need is a database and web form submission and now you have a full distributed system in your hands.
Only superficially so, await/async isn't usually like the old spaghetti multi-threaded code people used to write.
1 reply →
nginx is single–threaded, but you're absolutely right — any concurrency leads to the same ghost bugs.
1 reply →
Historically I would have agreed with you. But since the rise of LLM-assisted coding, I've encountered an increasing number of things I'd call clear "ghost bugs" in single threaded code. I found a fun one today where invoking a process four times with a very specific access pattern would cause a key result of the second invocation to be overwritten. (It is not a coincidence, I don't think, that these are exactly the kind of bugs a genAI-as-a-service provider might never notice in production.)