Comment by adastra22

3 days ago

NPUs are more energy efficient. There is no doubt that a systolic array uses less energy per computation than the equivalent tensor operation on a GPU, for these kinds of natural-fit applications.

Are they more performant? Hell no. But if you're going to do the calculation, and if you don't care about latency or throughput (e.g. batched processing of vector encodings), why not use the NPU?

Especially on mobile/edge consumer devices -- laptops or phones.
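To make the efficiency-vs-performance distinction concrete, here is a rough sketch. The power and throughput figures are made-up assumptions for illustration, not measurements of any real NPU or GPU, and `energy_per_op_pj` is just a hypothetical helper name. The point is only that energy per operation is power divided by throughput, so a slower device can still come out ahead on energy.

```python
def energy_per_op_pj(power_watts: float, ops_per_second: float) -> float:
    """Energy per operation in picojoules: (J/s) / (ops/s) = J/op."""
    return power_watts / ops_per_second * 1e12


# Hypothetical figures, chosen only to illustrate the arithmetic.
npu = energy_per_op_pj(power_watts=2.0, ops_per_second=10e12)   # slow, low power
gpu = energy_per_op_pj(power_watts=30.0, ops_per_second=60e12)  # fast, high power

print(f"NPU: {npu:.2f} pJ/op, GPU: {gpu:.2f} pJ/op")
```

With these assumed numbers the GPU has 6x the throughput but spends about 2.5x the energy per operation (roughly 0.5 pJ vs 0.2 pJ), which is the trade that makes sense for latency-insensitive batch work on a battery.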

> NPUs are more energy efficient. There is no doubt

Maybe because they sleep all the time. To be able to use an NPU you need at least a compiler which generates code for this particular NPU and a CPU scheduler which can dispatch instructions to this NPU.