Comment by teddyh
6 years ago
> I am not working on anything like this. I work on a lighting automation system. If I fuck up my release, nobody is going to die (well, most likely, there are some scenarios), but if I fuck up sufficiently, a lot of people will be incredibly annoyed. So I have every version of every dependency frozen. All the way down the dependency tree.
Most people’s systems are not that special that they absolutely need to do this, but it feeds one’s ego to imagine that it is. And, as I said, it feels more painful, but it’s actually easier to do this – i.e. being a hardass about new versions – than to do the legitimately hard job of integrating software and having good test coverage. It feeds the ego and feels useful and hard, but it’s actually easy; it’s no wonder it’s so very, very easy to fall into this trap. And once you’ve fallen in by lagging behind in this way, it’s even harder to climb out of it, since that would mean upgrading everything even faster to catch up. If you’re mostly up to date, you can afford to allow a single dependency to lag behind for a while to avoid some specific problem. But if you’re using all old unsupported stuff and there’s a critical security bug with no patch for your version, since the design was inherently buggy, you’re utterly hosed. You have no safety margin.
This is the danger of living on the edge of EoL, as you called it. Once you are forced to update a packages, usually with short notice at the most inconvenient time ever due to some zero-day vulnerability found in one of your depdendencies. Then the new version no longer supports another old version it has a subdependency to, which you also pinned, so you have to update the subdependency also. And to upgrade that package you have to update yet another subdependency, and so on.
Suddenly a small security-patch forces you to essentially replace your whole stack. If your test flags any error you have no idea which of the updated subpackages that caused it, because you have replaced all of them. Eventually you accumulate so much tech debt that it's tempting to cherry-pick the security patches into your packages instead of updating them to mainline, sucking you even deeper down the tech debt trap.
Integrating often means each integration is smaller, less risky and easier to pinpoint why the failure happens. Of course this assumes you have good automatic test coverage, which i assume you do if the systems are as life-critical as parent claim them to be.
There's also a big difference between embedded and connected systems here. Embedded SW usually get flashed once and then just do whatever they are supposed to do. Such SW really is "done", there is no need to maintain it or it's dependencies because it's not connected to the internet so zero-days or other vulnerabilities are not really a thing.