
Comment by ceejayoz

1 year ago

Overnight, planes tend to be plugged in to ground power for ventilation, to keep the batteries charged, for the cleaning crews, etc. Most get rebooted once in a while, but it's always possible one won't be, hence the directive to be certain.

This particular problem has been known for years (the article is from 2020).

Unfortunately, an aircraft has no “reboot”. It is just a violent power cut. A lot of headache is introduced in non-critical aircraft software because there is no “graceful shutdown” or long power duration. In fact, certain hardware has an upper limit (much lower than a week) before which it needs a power cut (sometimes called a power cycle), or it suffers from various buffer overflows and counter overflows and starts acting mysteriously.
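To make the counter overflow point concrete, here is a minimal sketch of the arithmetic. The counter widths and tick rates are assumptions picked for illustration, not figures from any particular aircraft system, and `days_until_overflow` and `wrap_signed32` are hypothetical helper names.

```python
# Illustrative only: widths and tick rates below are assumed, not taken
# from any specific avionics unit.
SECONDS_PER_DAY = 86_400


def days_until_overflow(bits: int, ticks_per_second: int, signed: bool = True) -> float:
    """Days of continuous power before a free-running tick counter overflows."""
    max_ticks = 2 ** (bits - 1) if signed else 2 ** bits
    return max_ticks / ticks_per_second / SECONDS_PER_DAY


def wrap_signed32(value: int) -> int:
    """Emulate what a 32-bit two's-complement counter holds after wrapping."""
    return (value + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    # Signed 32-bit counter incremented 100 times per second.
    print(round(days_until_overflow(32, 100), 2))
    # Unsigned 32-bit counter incremented 1000 times per second.
    print(round(days_until_overflow(32, 1000, signed=False), 2))
    # One tick past the maximum positive value lands on the most negative value.
    print(wrap_signed32(2**31 - 1 + 1))
```

A power cycle resets the counter to zero, which is why "turn it off and on again" works as a stopgap until a fixed software baseline is certified.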

  • It's amazing that's legal. Like, why do we accept software that does this? It can be done in such a way that these things don't happen. Put another way, why aren't the companies involved being fined and sued out of business? Why aren't their managers facing criminal negligence charges? It's outrageous.

    • Because there has never been a single commercial jetliner fatality caused by software in its intended operational domain failing to operate according to specification. That makes the commercial jetliner software development and deployment process by far the safest and highest reliability ever conceived by multiple orders of magnitude. We are talking in the 10-12 9s range.

      And just to get ahead of: “Well what about the 737 MAX”, that was a system specification error, not due to “buggy” software failing to conform to its specification. The software did what it was supposed to do, but it should not have been designed to do that given the characteristics of the plane and the safety process around its usage.


    • Because changes to that software go through an enormous amount of testing, validation and documentation before a new baseline becomes a flashable item. Meanwhile, an always-working workaround is needed now.

    • Have you even found the documentation around things like ACPI? It's kinda coupled with UEFI these days I think, and hell, I'm not even sure of the hardware boards/revisions aircraft makers are using these days... Are they still on BIOS? Or old-as-sin linux/RTOS kernels/microcontrollers?

      Point being, when you start talking about high QA systems, where the Quality is non-negotiable (you will have everything documented and tested); barring exec/managerial malfeasance in preventing that work from being done, you reach for the same simple things over and over again since it takes a hell of a lot of work to actually characterize and certify a thing to the requisite level of reliability/operating conditions.

      Testing ain't free, ya know.

  • This has to be a joke, right?

    You're telling me Aerospace's "real engineering-level" is worse than something a sophomore can cook up?

    • The testing for aerospace is extremely rigorous ... For DO-178C level A (Catastrophic failure that can cause a crash or many fatal injuries) we're estimating 2 years to reach the MC/DC test coverage metric on a fairly basic software system that has two mechanical backups. And that's above and beyond the extensive unit tests.

      The main thing that gets checked is the worst-case timing analysis for every branch condition. And there are stack monitors to monitor if the stack is growing in size.

      Look at Rapita Systems' website for more info ... we don't use them, but they explain it well.
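For anyone wondering what MC/DC actually demands, here is a toy sketch of the unique-cause form: every condition in a decision needs a pair of tests that differ only in that condition and flip the outcome. The decision `deploy` and the helper `mcdc_covered` are made up for illustration; real qualified tools measure this on instrumented code rather than on a Python function.

```python
from itertools import combinations
from typing import Callable, Sequence

Vector = tuple[bool, ...]


def mcdc_covered(decision: Callable[..., bool], tests: Sequence[Vector]) -> list[bool]:
    """For each condition, report whether some pair of tests shows it
    independently changing the decision outcome (unique-cause MC/DC)."""
    n = len(tests[0])
    covered = [False] * n
    for t1, t2 in combinations(tests, 2):
        # Indices of the conditions whose values differ between the two tests.
        diff = [i for i in range(n) if t1[i] != t2[i]]
        if len(diff) == 1 and decision(*t1) != decision(*t2):
            covered[diff[0]] = True
    return covered


def deploy(a: bool, b: bool, c: bool) -> bool:
    """Hypothetical three-condition decision."""
    return a and (b or c)


if __name__ == "__main__":
    tests = [
        (True, True, False),
        (False, True, False),   # differs from the first only in a
        (True, False, False),   # differs from the first only in b
        (True, False, True),    # differs from the third only in c
    ]
    print(mcdc_covered(deploy, tests))
    # Dropping the last test leaves c without an independence pair.
    print(mcdc_covered(deploy, tests[:3]))
```

Four tests are enough for three conditions here, but every decision in the code needs its own set, and each one has to be traced back to a requirement, which is part of why the effort adds up.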

  • >an aircraft has no “reboot”. It is just a violent power cut

    Guess how I typically reboot things :)

    • By traveling to Mexico and laying out bait along the migratory path of the butterflies?