Comment by userbinator
9 years ago
In general, people expect hardware to be far more bug-free than all the software which depends on it. I'm disgusted enough at the current trend of continuously releasing half-baked software and "updating" it to fix some of the bugs (while introducing new bugs in the process); and yet it seems to be spreading to hardware too. From the perspective of formal verification, buggy hardware is like an incorrect axiom.
but Intel isn't allowed to have issues in their processors, even if they can fix them with a software (microcode) update?
They could've caught and fixed this one with some more testing, before actually releasing. Formal verification isn't necessarily going to help, if it's a statistical type of defect --- thus the concept of wafer yield, and why dies manufactured from the exact same masks can behave completely differently with respect to the clock speeds attainable and voltages required.
Formal verification cannot be used to fully verify an entire Intel CPU, simply because the computational resources required to do so are astronomical. Heck, the cache coherence unit alone is basically impossible to verify.
Therefore, CPU designers verify the important units (e.g., the ALU) independently of each other, and then try to verify their interaction. But a system level design verification check is simply not possible.
Obviously, faults that result from fabrication can still take place, so they are tested for using BIST and JTAG. Test coverage can be pretty high, but obviously not 100%.
As you can see, there are still a ton of places where hardware errors can seep through.
Expectations of hardware are indeed higher than for software, but isn't it a valid trade-off for Intel to make as long as most of the issues can be fixed with a microcode update? On Windows, it's updated via Windows Update, and on most Linux distributions, a package exists for CPU microcode. Most users have an easy—if not automatic—way to obtain the updates.
I believe the main issue with formally verifying everything is that reasoning about parallel code is extremely hard and might well make the entire endeavour unattainable. It would be great to have formally verified CPUs, though.
> They could've caught and fixed this one with some more testing, before actually releasing.
Isn't this true of any bug?
Windows update applies CPU patches?!
I know they did AMT but that was a special case and different.
Seems like they do when it fixes serious bugs, a quick search yields e.g. https://support.microsoft.com/en-us/help/3064209/june-2015-i... ("This update fixes several issues with Intel CPUs that could cause computer crashes or functions incorrectly.")