Comment by tucnak
14 days ago
This has been the case for a while now. 3090 hoarders were always just doing it for street cred or whatever, no way these guys are computing anything of actual value.
Tenstorrent is on fire, though. For small businesses this is what matters. If 10M context is not a scam, I think we'll see SmartNIC adoption real soon. I would literally long AMD now because their Xilinx people are probably going to own the space real soon. Infiniband is cool and all, but it's also stupid and their scale-out strategy is non-existent. This is why https://github.com/deepseek-ai/3FS came out but of course nobody had figured it out because they still think LLM's is like, chatbots, or something. I think we're getting to a point where it's a scheduling problem, basically. So you get like like lots of GDDR6 (HBM doesnn't matter anymore) as L0, DDR5 as L1, and NVMe-oF is L2. Most of the time the agents will be running the code anyway...
This is also why Google never really subscribed to "function calling" apis
I was going to buy my first GPU for DL in 2018, but crypto didn't make it easy. I waited for the prices to fall, but demand kept up, then covid happened, then LLM happened and used GPUs now cost more than their original new prices. ... as we can see by the paper launch from Nvidia, lack of competition, and the prices of the 5000 series easily 50% above original MSRP. Demand is still here, now we have tarrif... Folks got reasons to collect, hoard or do whatever you think they are doing, even if it's just for street cred.
Tenstorrent
Not a hoarder per-se but I bought a 24GB card on the secondary market. My privacy is valuable. I'm okay being a half-step or full-step behind in LLM or image diffusion if it means my data never leaves my machine.
If you really were serious about privacy, you wouldn't put yourself at disadvantage with a locked-down six-year out of date card. Tenstorrent Blackhole exists now, btw.
Be serious now. Plenty of useful, privacy-required queries can be run with 24GB of VRAM, especially given the existence of e.g. Gemma 3 27B and the heavy NVIDIA-targeted optimisation work that has occurred.
The Tenstorrent cards exist, but are low in availability and the software is comparatively nonexistent. I'm excited for them too, but at the end of the day, I can buy a used 3090 today and do useful work with it, while the same is not true of TT yet.
I think it's disingenuous to suggest they're putting themselves at a disadvantage with an RTX 3090, especially in a comparison to an inferior product that isn't even shipping yet.
RTX 3090: 24GB RAM, 936.2GB/s bandwidth
Tenstorrent p150a: 32GB RAM, 512GB/s bandwidth
an extra 8GB of ram isn't worth nearly halving memory bandwidth.
3 replies →
> Infiniband is cool and all, but it's also stupid and their scale-out strategy is non-existent.
god I love this website.
Keyword: compute-in-network
Not sure what you’re suggesting. I’m well aware that things like SHARP exist.