Comment by numbers_guy

8 hours ago

Does anyone know why they brand it an "inference chip"? Is it something at the hardware level that makes it unsuitable for training, or is it simply that the toolchain for training is massively more complicated to program?

Very simplified: AI workloads need both compute and communications. Compute dominates inference, while communications dominates training.
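
A rough back-of-envelope sketch of that split, with made-up but plausible numbers (the model size, batch size, and all-reduce cost are assumptions, not measurements):

    # Illustrative compute vs. communication per step for one worker.
    params = 7e9           # assumed 7B-parameter model
    grad_bytes = 2         # bf16 gradients
    batch_tokens = 4e6     # assumed tokens per global training batch

    # Training (data parallel): forward+backward costs ~6 FLOPs per
    # parameter per token, and every step each worker all-reduces the
    # full gradient, moving roughly 2x its size over the interconnect.
    train_flops = 6 * params * batch_tokens
    train_bytes = 2 * params * grad_bytes

    # Inference: forward pass only, ~2 FLOPs per parameter per token;
    # the weights stay local, so there is no per-step gradient traffic
    # (ignoring tensor-parallel activation exchange).
    infer_flops = 2 * params * batch_tokens

    print(f"training:  {train_flops:.1e} FLOPs, {train_bytes:.1e} bytes on the wire")
    print(f"inference: {infer_flops:.1e} FLOPs, ~0 gradient bytes")

The byte count looks modest, but that all-reduce sits on the critical path of every single step and grows with model size, which is why the interconnect, not the ALUs, becomes the bottleneck at training scale.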

Most start-ups innovate on the compute side, whereas the technology needed for state-of-the-art communications is not widespread and is very low-level, with plenty of analog concerns. That domain is dominated by Nvidia and Broadcom today.

This is why digital start-ups tend to focus on inference. They innovate on the purely digital part, which is compute, and tend to use off-the-shelf IP for communications, so communications is not a differentiator for them and likely lags behind the leaders.

But in most cases, coupling a computation engine marketed for inference with state-of-the-art communications would, in theory, open the way to training too. It's just that doing both together is a very high barrier. It's more practical to start with compute and, if successful there, use that success to improve the comms side in a second stage. All the more so because everyone expects inference to be the bigger market anyway. So AI start-ups focus on inference first.

  • They also have the 'Tyr 4' [1].

    It doesn't have to compete on price 1:1. Ever since Trump took office, the Europeans have woken up to their dependence on the USA, which they no longer regard as a reliable partner. This applies to the defense industry, but also to critical infrastructure, including IT. The European alternatives are expected to cost something.

    [1] https://vsora.com/products/tyr/

Probably because their software only supports inference. That's relatively easy to do via ONNX; training requires an order of magnitude more software work.
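
For a sense of how little software the inference path needs, here's a minimal sketch using onnxruntime (the model file name is hypothetical, and the input handling assumes a single input tensor with possibly dynamic dimensions):

    import numpy as np
    import onnxruntime as ort

    # Load an exported graph; onnxruntime picks the default CPU provider.
    session = ort.InferenceSession("model.onnx")

    # Build a dummy input matching the model's declared name and shape,
    # substituting 1 for any dynamic (string-valued) dimensions.
    inp = session.get_inputs()[0]
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.rand(*shape).astype(np.float32)

    outputs = session.run(None, {inp.name: x})  # None = return all outputs
    print([o.shape for o in outputs])

Exporting a model from PyTorch or TensorFlow into that graph is well-trodden; wiring up a new back-end for autograd, optimizers, and distributed training is the part that costs the extra order of magnitude.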