Comment by vlovich123
7 hours ago
Not really. GPUs are stateless, so the bound on their lifetime, regardless of how much you use them, is the lifetime of the shittiest capacitor on there (essentially). Modulo a design defect or manufacturing defect, I’d expect a usable lifetime of at least 10 years, well beyond the manufacturer’s desire to support the drivers for it (i.e. the sw should “fail” first).
The silicon itself does wear out. Dopant migration or something, I'm not an expert. Three years is probably too low, but they do die. GPUs dying during training runs was a major engineering problem that had to be tackled to build LLMs.