Comment by marcan_42
4 years ago
Don't forget it's not just instruction sets; Intel is the reason we don't have ECC RAM on desktops. Every other high density storage technology has used error correction for a decade or two, but we're still sitting here pretending we can have 512 billion bits of perfect memory sitting around that will never go wrong, because Intel fuse it off on desktop chips. I guess only servers need to be reliable.
AMD supports ECC on their consumer chips, but without Intel support it's never taken off and some motherboards don't support it, or if they do it's not clear in the documentation. I do use ECC RAM on my Threadripper machine and it does work, but I had to look for third party info on whether it would and dig around DMI and EDAC info to convince myself it was really on. It also makes it safer to overclock RAM since you get warnings when you're pushing things too far, before outright failures. And it helps with Rowhammer mitigation.
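For anyone wanting to do the same EDAC check on Linux, here is a minimal sketch. It assumes the kernel's EDAC driver for your platform is loaded and exposes its usual sysfs counters under /sys/devices/system/edac/mc; the function name `read_edac_counts` is just an illustration. An empty result does not prove ECC is off (the driver may simply not be loaded), so treat this as one data point alongside DMI info.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location; present only when an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: (corrected_errors, uncorrected_errors)}."""
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts  # no EDAC driver loaded, or not a Linux system
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found; ECC reporting may be inactive.")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A rising corrected count while overclocking RAM is the kind of early warning mentioned above.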
Apple M1s don't do ECC in the memory controller as far as I can tell, but at least they have a good excuse: you can't sensibly do ECC with 16-bit LPDDR RAM channels. There's no such excuse for 64/72-bit DIMM modules. I do hope we work out a way to make ECC available on mobile/LPDDR architectures in the future, though. Probably with something like in-RAM-die ECC (which for all I know might already be a thing on M1s; we don't have all the details).
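One way to see why narrow channels are awkward, assuming a classic Hamming-style SECDED code (the comment above does not say which scheme it has in mind, so this is only an illustration): the number of check bits grows slowly with word width, so the relative overhead is much worse for a 16-bit word than for a 64-bit one.

```python
def secded_check_bits(data_bits):
    """Check bits for a Hamming SECDED code over `data_bits` of data.

    Single-error correction needs the smallest r with 2**r >= data_bits + r + 1;
    one extra overall parity bit adds double-error detection.
    """
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1


for width in (16, 32, 64):
    check = secded_check_bits(width)
    print(f"{width}-bit word: {check} check bits, {100 * check / width:.1f}% overhead")
```

Under that assumption a 64-bit word needs 8 check bits (the familiar 72-bit DIMM, 12.5% overhead), while a 16-bit word needs 6 check bits (37.5% overhead).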
> Don't forget it's not just instruction sets; Intel is the reason we don't have ECC RAM on desktops. Every other high density storage technology has used error correction for a decade or two, but we're still sitting here pretending we can have 512 billion bits of perfect memory sitting around that will never go wrong, because Intel fuse it off on desktop chips. I guess only servers need to be reliable.
And not just storage - the main memory bus is the only data bus in a modern computer that doesn't use some form of error correction or detection. Even USB 1.0 has a checksum. Everywhere else we use ECC/FEC or at least a checksum: PCIe, SATA, USB; all storage devices, as you mentioned, rely heavily on FEC; all CPU caches use ECC. The one exception is main memory and its bus, which all data moves through (eventually). Duh.
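To make the "at least a checksum" point concrete, here is a small sketch of detection-only protection. It uses CRC-32 from Python's zlib as a stand-in; the buses named above use their own CRCs and framing, and the helper names `frame`, `check`, and `flip_bit` are made up for this example. The receiver can tell that a bit flipped and ask for a retransmit, which is exactly what a non-ECC memory bus cannot do.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (illustrative framing only)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(framed: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored one."""
    payload, crc = framed[:-4], framed[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with one bit inverted, simulating a bus error."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


sent = frame(b"hello, memory bus")
received = flip_bit(sent, 13)  # corrupt a single bit in transit

print("clean frame ok:    ", check(sent))
print("corrupted frame ok:", check(received))
```

A checksum like this only detects the error; ECC goes one step further and corrects it in place without a retransmit.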
Yup. PCIe will practically run over wet string, thanks to error detection and retransmits and other reasons, but try having a marginal DRAM bus and see how much fun that is...
Could be a fun way to test and demonstrate robustness of various parts of computer hardware, actually. It's already been done with ADSL for example:
[0] https://www.revk.uk/2017/12/its-official-adsl-works-over-wet...
> Intel is the reason we don't have ECC RAM on desktops.
Intel has offered ECC support in a lot of their low-end i3 parts for a long time. They’re popular for budget server builds for this reason.
The real reason people don’t use ECC is because they don’t like paying extra for consumer builds. That’s all. ECC requires more chips, more traces, and more expense. Consumers can’t tell if there’s a benefit, so they skip it.
> AMD supports ECC on their consumer chips, but without Intel support it's never taken off
You’re blaming Intel’s CPU lineup for people not using ECC RAM on their AMD builds?
Let’s be honest: People aren’t interested in ECC RAM for the average build. I use ECC in my servers and workstations, but I also accept that I’m not the norm.
> You’re blaming Intel’s CPU lineup for people not using ECC RAM on their AMD builds?
I'm blaming the decade+ of Intel dominance for killing any chance of ECC becoming popular in non-server environments, just as RAM density was reaching the point where it is absolutely essential for reliability.
> The real reason people don’t use ECC is because they don’t like paying extra for consumer builds. That’s all. ECC requires more chips, more traces, and more expense. Consumers can’t tell if there’s a benefit, so they skip it.
Motherboard traces are ~free and the feature is in the die already, so it requires zero expense to offer it to consumers. Intel chose to artificially cripple their chips to remove that option. Yes, I know there are a few oddball lines where they did offer it. They should have offered it across the board from the get go, seeing as they were selling the same dies with ECC for workstation use.
> I'm blaming the decade+ of Intel dominance for killing any chance of ECC becoming popular in non-server environments
I disagree. AMD has offered ECC support for a while and it’s not catching on. It doesn’t make sense to blame this on Intel.
> Motherboard traces are ~free and the feature is in the die already, so it requires zero expense to offer it to consumers.
Yet it’s missing from a substantial number of AMD boards, despite being supported. You have to specifically confirm the motherboard added those traces before buying it.
Traces aren’t entirely free. Modern boards are densely packed and manufacturers aren’t interested in spending extra time on routing for a feature that consumers aren’t interested in anyway.
ECC memory on the other hand is always going to be more expensive.
>You’re blaming Intel’s CPU lineup for people not using ECC RAM on their AMD builds?
Yes. ECC was standard on the first IBM PC 5150, on the PS/2 line, on pretty much all 286 clones, etc. Intel killed ECC on the desktop when moving to Pentium; prior to that, all of their chipset products (486) supported it. 1995 artificial market segmentation shenanigans: https://www.pctechguide.com/chipsets/intels-triton-chipsets-...
They did support ECC on some i3s simply because they did not bother to double the SKUs; however, IIRC you need the server / WS S chipset to enable it, at which point you might as well just put an entry-level Xeon on that.
In absolute terms, the cost of ECC everywhere would not be substantially greater than the prices we have now without it. Current ECC prices are high because it is not broadly used, not the other way around. Consumers skip it because it is fucking hard to get ECC-enabled parts for S SKUs (or H / U) in the current situation, while there are plenty of non-ECC vendors and resellers, and something like at least 3 times the number of SKUs. And consumers have not been informed they are buying unreliable shit.
> Intel has offered ECC support in a lot of their low-end i3 parts for a long time. They’re popular for budget server builds for this reason.
Intel removed ECC support in the 10th gen so you have to go for Xeon nowadays.
With DDR5 you can have (a form of) ECC on all current 12th-generation Core CPUs. That is, if you were able to find DDR5 DIMMs on the market, which you currently cannot.
As far as I can tell, Intel only offered ECC on a small handful of i3 parts that mainly seemed to be marketed to NAS manufacturers, likely because they were otherwise giving up that market entirely to competitors like AMD. They really don't seem to be interested in offering it as an option on consumer desktops.
"pretending we can have 512 billion bits of perfect memory sitting around that will never go wrong, because Intel fuse it off on desktop chips"
I think computers are now so important to our life, we need to start regulating them like we do cars.
Start seriously slapping companies that deliberately or negligently release equipment with obsolete kernels and security holes, mandate ECC like we mandate ABS, mandate part availability for 10 years like we do with cars, etc.
Every day we let this slide, thousands of people lose precious data and the number of 'smart' toasters mining crypto increases.
My main worry with this sort of thing is that if we start mandating legal liability, and security becomes a compliance line-item, then companies are going to start locking down everything they ship so they have a legal defense in court. The argument's going to be, "if we are liable for shipping insecure desktops then you shouldn't be allowed to install Linux onto them and then sue us when you get hacked".
Think about how many laptops ship with Wi-Fi whitelists with the excuse of "FCC certification". It doesn't matter that the FCC doesn't actually prohibit users from swapping out Wi-Fi cards; manufacturers will do it anyway.
Just add a physical seal on the product like other dumb electronics do.
> AMD supports ECC on their consumer chips
And now the next desktop consumer upgrade I purchase will be AMD and will have ECC (well... unless it's way more expensive).
Since the Mac Pro has ECC RAM, I would expect a future Apple Silicon Mac Pro to offer it as well with its desktop M1 chip, with the functionality trickling down the line in years to come.
DDR5 is a form of ECC and DDR5 is only supported on Intel so far.
The DDR5 memory bus used by Intel's latest consumer processors does not have ECC enabled. The memory dies themselves have some internal ECC that is not exposed to the host system and is not related to the fact that they use a DDR5 interface; all state of the art DRAM now needs on-die ECC due to the high density.
So what? It has on-die ECC, which allows it to recover from radiation-induced bit flips and the like. Maybe, to compensate for density, the error correction is kept busier and can handle fewer errors per minute, but "0.5 ECC" instead of full ECC on DDR4 (which has no random errors due to density) is still an improvement for most people in terms of immunity to unlucky cosmic rays.
> Don't forget it's not just instruction sets; Intel is the reason we don't have ECC RAM on desktops.
Of course we do: workstations.
It's cheaper, that's why it isn't everywhere.
Intel's lower end workstation chips are the same silicon, and thus the same manufacturing cost, as their desktop chips. They just artificially disable features like ECC for product segmentation. It is unconscionable that something as essential as ECC is crippled out of the consumer line-up.
Except that the memory chips and motherboards also need to support ECC