Comment by wtallis
5 days ago
It was luck that a viable non-graphics application like deep learning existed which was well-suited to the architecture NVIDIA already had on hand. I certainly don't mean to diminish the work NVIDIA did to build their CUDA ecosystem, but without the benefit of hindsight I think it would have been very plausible that GPU architectures would not have been amenable to any use cases that would end up dwarfing graphics itself. There are plenty of architectures in the history of computing which never found a killer application, let alone three or four.
Even that is arguably not luck; it just followed a non-obvious trajectory. Graphics uses a fair amount of linear algebra, so people with large-scale physical modeling needs (among many others) became interested. To an extent, the deep learning craze kicked off because developments in GPU computation enabled economical training.
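To make that concrete, here is a minimal sketch (plain NumPy, with made-up sizes) showing that a graphics vertex transform and a modeling/ML workload reduce to the same dense matrix-multiply primitive, which is why the hardware transferred so naturally:

    import numpy as np

    # Illustrative sizes only.
    vertices = np.random.rand(100_000, 4)   # homogeneous coordinates, as in graphics
    transform = np.random.rand(4, 4)        # a model-view-projection matrix

    # A graphics pipeline transforms every vertex...
    transformed = vertices @ transform.T

    # ...and a physical model or neural-network layer performs the same
    # kind of dense matrix multiply, just with larger operands.
    features = np.random.rand(100_000, 512)
    weights = np.random.rand(512, 512)
    activations = features @ weights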
Nvidia started their GPGPU adventure by acquiring a physics engine and porting it over to run on their GPUs. Supporting linear algebra operations was pretty much the goal from the start.
They were also full of lies when they started their GPGPU adventure (as they still are today).
For a few years they repeated continuously that GPGPU could provide roughly 100 times the speed of CPUs.
This has always been false. GPUs really are much faster, but their performance per watt has mostly hovered around 3 times, and at times up to 4 times, that of CPUs. That is impressive, but very far from the factor of 100 originally claimed by NVIDIA.
Far more annoying than the exaggerated performance claims is how, during the first GPGPU years, the NVIDIA CEO talked about how their GPUs would democratize computing, giving everyone access to high-throughput computing.
After a few years these optimistic prophecies stopped, and NVIDIA promptly removed FP64 support from its affordably priced GPUs.
A few years later, AMD followed NVIDIA's example.
Now only Intel has made an attempt to revive GPUs as "GPGPUs", but there seems to be little conviction behind this attempt, as they do not even advertise the capabilities of their GPUs. If Intel also abandons this market, then the "general-purpose" in GPGPU will truly be dead.
That physics engine is an example of a dead end.
There's something of a feedback loop here: the reason transformers and attention won out over all the other forms of AI/ML is that they worked very well on the architecture NVIDIA had already built, so you could scale your model size dramatically just by throwing more commodity hardware at it.
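As a rough sketch of why the fit was so good: the core of scaled dot-product attention is just two large dense matrix multiplications plus a softmax (sizes below are illustrative), which is exactly the workload GPUs were already built for, so throwing more hardware at it scales almost linearly:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # First big matmul: (seq, d) @ (d, seq) -> (seq, seq)
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        # Numerically stable softmax over the last axis
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        # Second big matmul: (seq, seq) @ (seq, d) -> (seq, d)
        return w @ V

    # Illustrative sizes only.
    seq, d = 1024, 64
    Q, K, V = (np.random.rand(seq, d) for _ in range(3))
    out = scaled_dot_product_attention(Q, K, V)  # shape (1024, 64)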