
Comment by hydroreadsstuff

13 days ago

This means GPUs are dead for local enthusiast AI. And SoCs with big RAM are in.

Because 17B active parameters should reach good enough performance on 256-bit LPDDR5X, since decoding is mostly limited by memory bandwidth.
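A rough back-of-the-envelope check of that claim, as a sketch rather than a benchmark. It assumes decoding is purely bandwidth-bound (every active weight is read once per token), an LPDDR5X transfer rate of 8533 MT/s, and 4-bit or 8-bit weights. None of those figures come from the comment above, and real throughput will be lower once KV-cache reads and overheads are counted.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (bytes per transfer * transfers per second)."""
    return bus_width_bits / 8 * transfer_rate_mts * 1e6 / 1e9


def decode_tokens_per_second(active_params_billion: float,
                             bits_per_weight: int,
                             bandwidth_gbs: float) -> float:
    """Upper bound on decode speed if each token requires reading all active weights once."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gbs / gb_read_per_token


# Assumed values: 256-bit bus, 8533 MT/s LPDDR5X, 17B active parameters.
bandwidth = peak_bandwidth_gbs(256, 8533)
tps_4bit = decode_tokens_per_second(17, 4, bandwidth)
tps_8bit = decode_tokens_per_second(17, 8, bandwidth)

print(f"peak bandwidth: {bandwidth:.1f} GB/s")
print(f"4-bit weights:  {tps_4bit:.1f} tok/s upper bound")
print(f"8-bit weights:  {tps_8bit:.1f} tok/s upper bound")
```

Under these assumptions the ceiling is roughly 273 GB/s, which gives about 32 tok/s at 4-bit and about 16 tok/s at 8-bit.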

This has been the case for a while now. 3090 hoarders were always just doing it for street cred or whatever; there's no way these guys are computing anything of actual value.

Tenstorrent is on fire, though. For small businesses this is what matters. If 10M context is not a scam, I think we'll see SmartNIC adoption real soon. I would literally long AMD now because their Xilinx people are probably going to own the space real soon. InfiniBand is cool and all, but it's also stupid and their scale-out strategy is non-existent. This is why https://github.com/deepseek-ai/3FS came out, but of course nobody has figured it out because they still think LLMs are, like, chatbots or something. I think we're getting to a point where it's a scheduling problem, basically. So you get lots of GDDR6 (HBM doesn't matter anymore) as L0, DDR5 as L1, and NVMe-oF as L2. Most of the time the agents will be running the code anyway...
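To make the "scheduling problem" framing concrete, here is a toy sketch of greedy placement across the three tiers named above. The tier capacities, block names, sizes, and access rates are all made-up illustration values, and the hottest-first greedy policy is just one simple choice, not a description of how any real system schedules memory.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    capacity_gb: float
    used_gb: float = 0.0


def place_blocks(blocks, tiers):
    """Greedily place the hottest blocks in the fastest tier that still has room.

    blocks: list of (block_id, size_gb, accesses_per_second)
    tiers:  list of Tier, ordered fastest first
    """
    placement = {}
    # Visit blocks from most to least frequently accessed.
    for block_id, size_gb, _rate in sorted(blocks, key=lambda b: b[2], reverse=True):
        for tier in tiers:
            if tier.used_gb + size_gb <= tier.capacity_gb:
                tier.used_gb += size_gb
                placement[block_id] = tier.name
                break
        else:
            raise ValueError(f"no tier has room for {block_id}")
    return placement


# Hypothetical capacities and workload, for illustration only.
tiers = [Tier("L0 GDDR6", 24), Tier("L1 DDR5", 128), Tier("L2 NVMe-oF", 4096)]
blocks = [
    ("active_experts", 10, 1000.0),
    ("kv_cache_hot", 12, 500.0),
    ("idle_experts", 100, 5.0),
    ("kv_cache_cold", 500, 0.1),
]
placement = place_blocks(blocks, tiers)
for block_id, tier_name in placement.items():
    print(f"{block_id:>15} -> {tier_name}")
```

With these numbers the two hot blocks land in GDDR6, the idle experts in DDR5, and the cold KV cache on NVMe-oF.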

This is also why Google never really subscribed to "function calling" APIs.

  • I was going to buy my first GPU for DL in 2018, but crypto didn't make it easy. I waited for the prices to fall, but demand kept up, then covid happened, then LLMs happened, and used GPUs now cost more than their original new prices. ... as we can see by the paper launch from Nvidia, lack of competition, and the prices of the 5000 series easily 50% above original MSRP. Demand is still here, and now we have tariffs... Folks have reasons to collect, hoard, or do whatever you think they are doing, even if it's just for street cred.

  • Not a hoarder per se, but I bought a 24GB card on the secondary market. My privacy is valuable. I'm okay being a half-step or full step behind in LLMs or image diffusion if it means my data never leaves my machine.

    • If you really were serious about privacy, you wouldn't put yourself at a disadvantage with a locked-down, six-years-out-of-date card. Tenstorrent Blackhole exists now, btw.
