Comment by aswegs8
20 hours ago
Not sure why this "GPUs obsolete after 3 years" gets thrown around all the time. Sounds completely nonsensical.
20 hours ago
Not sure why this "GPUs obsolete after 3 years" gets thrown around all the time. Sounds completely nonsensical.
Especially since AWS still have p4 instances that are 6 years old A100s. Clearly even for hyperscalers these have a useful life longer than 3 years.
Economically obsolete, not obsolete, I suspect this is in line with standard depreciation.
I agree that there is hyperbole thrown around a lot here and its possible to still use some hardware for a long time or to sell it and recover some cost but my experience in planning compute at large companies is that spending money on hardware and upgrading can often result in saving money long term.
Even assuming your compute demands stay fixed, its possible that a future generation of accelerator will be sufficiently more power/cooling efficient for your workload that it is a positive return on investment to upgrade, more so when you take into account you can start depreciating them again.
If your compute demands aren't fixed you have to work around limited floor space/electricity/cooling capacity/network capacity/backup generators/etc and so moving to the next generation is required to meet demand without extremely expensive (and often slow) infrastructure projects.
Sure, but I don't think most people here are objecting to the obvious "3 years is enough for enterprise GPUs to become totally obsolete for cutting-edge workloads" point. They're just objecting to the rather bizarre notion that the hardware itself might physically break in that timeframe. Now, it would be one thing if that notion was supported by actual reliability studies drawn from that same environment - like we see for the Backblaze HDD lifecycle analyses. But instead we're just getting these weird rumors.
It's because they run 24/7 in a challenging environment. They will start dying at some point and if you aren't replacing them you will have a big problem when they all die en masse at the same time.
These things are like cars, they don't last forever and break down with usage. Yes, they can last 7 years in your home computer when you run it 1% of the time. They won't last that long in a data center where they are running 90% of the time.
A makeshift cryptomining rig is absolutely a "challenging environment" and most GPUs by far that went through that are just fine. The idea that the hardware might just die after 3 years' usage is bonkers.
Crypto miners undervote for efficiency GPUs and in general crypto mining is extremely light weight on GPUs compared to AI training or inference at scale
With good enough cooling they can run indefinitely!!!!! The vast majority of failures are either at the beginning due to defects or at the end due to cooling! It’s like the idea that no moving parts (except the HVAC) is somehow unreliable is coming out of thin air!