Comment by Herring

13 hours ago

> Ok then I look forward to seeing DeepSeek running instantly at the end of April.

Why so negative lol. The speed and much lower power use of this thing are nothing to be sneezed at. I mean, hardware-accelerated LLMs are a huge step forward. But yeah, this is basically a proof of concept. I wouldn't be surprised if the form factor and power use shrink even further, and we start seeing stuff like this in all kinds of hardware. It's an enabler.

  • You don't know. You just have marketing materials, not independent analysis. Maybe it actually takes 2 years to design and manufacture the hardware, so anything that comes out will be badly out of date. Wouldn't be the first time someone lied. A slick demo backed by millions of dollars shouldn't be enough to dispel such doubts.

    • Did you not see the chatbot they posted online (https://chatjimmy.ai/)? That thing is near-instantaneous; it's all the proof you need that this is real.

      And if the hardware is real and functional, as you can independently verify by chatting with that thing, how much more effort would it be to etch more recent models?

      The real question, of course, is: what about LARGER models? I'm assuming you can apply some of the existing LLM inference parallelization techniques and split the workload over multiple cards (see the sketch at the end of this comment). Some of the 32B models are plenty powerful.

      It's a proof of concept, and a convincing one.
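
      To be clear, the multi-card split is speculation on my part. Below is a minimal sketch of the layer-wise (pipeline) version in plain PyTorch, on two hypothetical GPUs and with toy layer sizes, just to show mechanically what "split the workload over multiple cards" means; whether the etched-model cards expose anything like this, and whether their interconnect is fast enough to pass activations between the halves, is exactly the open question.

      ```python
      import torch
      import torch.nn as nn

      # Two hypothetical devices; fall back to CPU so the sketch runs anywhere.
      devices = ["cuda:0", "cuda:1"] if torch.cuda.device_count() >= 2 else ["cpu", "cpu"]

      # Stand-in for a transformer: a stack of identical blocks (toy size).
      blocks = [nn.Linear(1024, 1024) for _ in range(8)]

      # Pipeline split: first half of the layers on one card, second half on the other.
      first_half = nn.Sequential(*blocks[:4]).to(devices[0])
      second_half = nn.Sequential(*blocks[4:]).to(devices[1])

      def forward(x):
          x = first_half(x.to(devices[0]))   # compute the early layers on card 0
          x = second_half(x.to(devices[1]))  # ship the activation over, finish on card 1
          return x

      print(forward(torch.randn(1, 1024)).shape)  # torch.Size([1, 1024])
      ```

      Tensor parallelism would instead split each matrix multiply across the cards, which cuts per-token latency but needs a much faster link between them.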