Comment by tux1968

11 hours ago

It would be very helpful to deeply understand the truth behind this management failing. The actual players involved, and their thinking. Was it truly a blind spot? Or was it mistaken priorities? I mean, this situation has been so obvious and tragic, that I can't help feeling like there is some unknown story-behind-the-story. We'll probably never really know, but if we could, I wouldn't spend quite as much time wearing a tinfoil hat.

if you asked AMD execs they'd probably say they never had the money to build out a software team like NVIDIA's. that might only be part of the answer. the rest would be things like lack of vision, "can't turn a tanker on a dime", etc.

  • I don't buy that story. NVIDIA wasn't that huge of a company when they built CUDA, and they weren't huge when the first GPT model was trained with it.

  • Has to be lack of vision. I refuse to believe it's impossible to _do_, but it sounds like it's impossible to _specify_ within AMD. Like they're genuinely incapable of working out what the solution might look like.

  • Nobody is asking AMD to rebuild the entire NVIDIA ecosystem. Most people just want to run GPGPU code or ML code on AMD GPUs without the entire computer crashing on them.

    • yeah it's a very frustrating situation.

      according to public information NVIDIA started working on CUDA in 2004, that was before AMD made the ATI acquisition.

      my suspicion is that back then ATI and NVIDIA had very different orientations. neither AMD nor ATI were ever really that serious about software. so in that sense i guess it was a match made in heaven.

      so you have a cultural problem, which is bad enough, then you add in the lean years AMD spent in survival mode. forget growing a software team, they had to cling on to fewer people just to get through.

      now they're playing catch-up in a cutthroat market that's moving at light speed compared to 20 years ago.

      we're talking about a major fumble here, so it's easy to lose context and forget that things were a little more complex than they appeared.

My guess is it’s just incompetence. Imagine you’re in charge of ROCm and your boss asks you how it’s going. Do you say good things about your team and progress? Do you highlight the successes and say how you can do all the major things CUDA can? I think many people would. Or do you say to your boss “the project I’m in charge of is a total disaster and we are a joke in the industry”? That’s a hard thing to say.

  • a 10 year lead can't be closed overnight but Intel had an even larger lead and look how the mighty have fallen.

    • Intel was never famous for good GPUs, and they are basically the only ones still trying to make something out of OpenCL, with most of the tooling going beyond what Khronos offers.

      oneAPI is much more than a plain old SYCL distribution, and still.

  • > My guess is it’s just incompetence.

    maybe on some level but not at the level you're describing. pretty much everyone at AMD understands the situation, and has for a while.