Comment by temac

4 years ago

Not really: internal ECC in DDR5 is an implementation detail that is not exposed on the bus and does not give you the reliability and monitoring capability that real ECC terminated in the memory controller does. It is only there because the error rate would be absolutely horrific without it, so you need internal ECC just to get back to basically the same point you were at without ECC on DDR4.

I expect in-chip ECC should still be a significant improvement for RAM reliability (any ECC is going to be better than none, even if your memory array is significantly worse; I've had my share of RAM with weak bits that would absolutely be fixed by that), but it's not going to help with bus errors and isn't nearly as transparent to system software as end-to-end ECC is.

  • In theory, some weak ECC on top of particularly unreliable storage can still be less reliable than far more reliable storage not employing any ECC, but I also suspect this won't be the case here. However, if the target reliability is only, say, 2 or 3 times what you had with DDR4 without ECC, it is still completely unsuitable for serious applications. And really we should find another name for the internal ECC of DDR5, because the services it provides are completely different from those of real ECC.
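To put rough numbers on the "in theory" case above, here is a back-of-the-envelope sketch. It assumes independent bit errors, a single-error-correcting code over a 136-bit word (128 data bits plus 8 check bits), and completely made-up raw bit error probabilities; none of these figures come from a datasheet, and the function names are just for illustration.

```python
from math import comb

def p_at_least_k_errors(n_bits: int, p_bit: float, k: int) -> float:
    """Probability that at least k of n_bits are wrong, assuming independent bit errors."""
    return sum(
        comb(n_bits, i) * p_bit**i * (1.0 - p_bit) ** (n_bits - i)
        for i in range(k, n_bits + 1)
    )

def p_word_fails_plain(data_bits: int, p_bit: float) -> float:
    """No ECC: the word is bad as soon as any single bit is wrong."""
    return p_at_least_k_errors(data_bits, p_bit, 1)

def p_word_fails_sec(code_bits: int, p_bit: float) -> float:
    """Single-error correction: the word is bad only with two or more bit errors."""
    return p_at_least_k_errors(code_bits, p_bit, 2)

# Hypothetical numbers: a "good" array without ECC vs. worse arrays with SEC.
good_array_no_ecc = p_word_fails_plain(128, 1e-12)
bad_array_with_sec = p_word_fails_sec(136, 1e-6)
less_bad_array_with_sec = p_word_fails_sec(136, 1e-8)

print(f"no ECC, p_bit=1e-12: {good_array_no_ecc:.3e}")
print(f"SEC,    p_bit=1e-6 : {bad_array_with_sec:.3e}")
print(f"SEC,    p_bit=1e-8 : {less_bad_array_with_sec:.3e}")
```

With these invented inputs, the SEC-protected word on the much worse array fails more often (roughly 9e-9 per word) than the unprotected word on the good array (roughly 1.3e-10), while the less degraded array with SEC comes out well ahead. Which regime real DDR5 parts sit in depends on raw error rates that vendors do not publish.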

    • In the industry we already have terms of art that are better than just “ECC”. Normally we speak of EDAC, error detection and correction. We refer to schemes by their capabilities, such as SECDED (single error correction, double error detection), ChipKill, or whatever.