Comment by gardnr

17 hours ago

> The training and deployment of LongCat-2.0 are built on large-scale clusters of tens of thousands of AI ASIC superpods. Compared to the mature Nvidia GPU ecosystem, the supporting software community is still less developed. We have therefore put significant effort into building a stable, secure, and scalable infrastructure.

This is the real news story. It looks like they may have used Huawei Ascend 910C chips: https://nitter.net/teortaxesTex/status/2071708141037781407#m

8 comments


BoorishBears 14 hours ago

If they really managed this without NVIDIA, from pre-training a 1.6T-parameter model through to post-training, Dwarkesh Patel got what he wanted.

chvid 12 hours ago

It is interesting how much people doubt Huawei’s capabilities in this area. Jensen does not (in the Dwarkesh Patel interview), though of course you can dismiss that as him talking his own book.

Jabrov 14 hours ago

Who? What did he want?

- gardnr 13 hours ago
  
  Dwarkesh Patel has AI/ML guests on his podcast. BoorishBears may have been referring to the Jensen Huang episode where they discuss TPUs: https://youtu.be/Hrbq66XqtCo?t=982
  
