Comment by gdiamos
5 days ago
Most people don't appreciate how many dead end applications NVIDIA explored before finding deep learning. It took a very long time, and it wasn't luck.
It was luck that a viable non-graphics application like deep learning existed which was well-suited to the architecture NVIDIA already had on hand. I certainly don't mean to diminish the work NVIDIA did to build their CUDA ecosystem, but without the benefit of hindsight I think it would have been very plausible that GPU architectures would not have been amenable to any use cases that would end up dwarfing graphics itself. There are plenty of architectures in the history of computing which never found a killer application, let alone three or four.
Even that is arguably not lucky; it just followed a non-obvious trajectory. Graphics uses a fair amount of linear algebra, so people with large-scale physical modeling needs (among many others) became interested. To an extent, the deep learning craze kicked off because developments in GPU computation made training economical.
Nvidia started their GPGPU adventure by acquiring a physics engine and porting it over to run on their GPUs. Supporting linear algebra operations was pretty much the goal from the start.
9 replies →
There's something of a feedback loop here, in that the reason transformers and attention won out over the other forms of AI/ML is that they worked very well on the architecture NVIDIA had already built, so you could scale your model size dramatically just by throwing more commodity hardware at it.
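To make the "worked very well on that architecture" point concrete: the core of attention boils down to a couple of dense matrix multiplies plus a softmax, which is exactly the workload GPUs were already optimized for. A minimal sketch (my own illustration, not from the thread) using NumPy:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)     # one big matmul
    # numerically stable softmax over each row of scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                # another big matmul

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 64))
K = rng.standard_normal((8, 64))
V = rng.standard_normal((8, 64))
out = attention(Q, K, V)
print(out.shape)  # (8, 64)
```

Because both heavy steps are plain matmuls, scaling the model mostly means scaling the matrix sizes and batch count, which commodity GPU hardware handles almost for free.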
It was luck, but that doesn't mean they didn't work very hard too.
Luck is when preparation meets opportunity.
It was definitely luck, Greg. And Nvidia didn't invent deep learning; deep learning found Nvidia's investment in CUDA.
I remember it differently. CUDA was built with the intention of finding/enabling something like deep learning. I thought it was unrealistic too and took it on faith in people more experienced than me, until I saw deep learning work.
Some of the near misses I remember included Bitcoin. Many of the other attempts never saw the light of day.
Luck in English often means success by chance rather than by one's own efforts or abilities. I don't think that characterizes CUDA. I think it was eventual success in the face of extreme difficulty, many failures, and sacrifices. In hindsight, I'm still surprised that Jensen kept funding it as long as he did. I've never met a leader since who I think would have done that.
Nobody cared about deep learning back in 2007, when CUDA was released. It wasn't until the 2012 AlexNet milestone that deep neural nets started to become en vogue again.
2 replies →
CUDA was profitable very early because of oil and gas code, reverse time migration and the like. There was no act of incredible foresight from Jensen. In fact, I recall him threatening to kill the program if the large projects it depended on for profitability, like the Titan supercomputer at Oak Ridge, fell through.
1 reply →
So it could just as easily have been Intel or AMD, despite them not having CUDA or any interest in that market? Was it pure luck that the one large company that invested to support a market reaped most of the benefits?