Comment by GrantMoyer
2 days ago
The idea is that NPUs are more power efficient for convolutional neural network operations. I don't know whether they actually are more power efficient, but it'd be wrong to dismiss them just because they don't unlock new capabilities or perform well for very large models. For smaller ML applications like blurring backgrounds, object detection, or OCR, they could be beneficial for battery life.
Yes, the idea before the whole shove-LLMs-into-everything era was that small, dedicated models for different tasks would be integrated into both the OS and applications.
If you're using a recent phone with a camera, it's likely running ML models, which may or may not be offloaded to AI accelerators/NPUs on the device itself. The small models are there, though.
Same thing with translation, subtitles, etc. All small local models doing specialized tasks well.
OCR on smartphones is a clear winner in this area. Stepping back, it's just mind-blowing how easy it is to take a picture of text, then select it and copy-paste it into whatever. And I totally take it for granted.
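For anyone who wants to poke at that pipeline off-device, here's a rough sketch using pytesseract (a wrapper around Tesseract; phones use their own on-device models, not this). The filename is just a placeholder:

    # Rough desktop approximation of "photo -> selectable text".
    # Assumes Tesseract is installed and pytesseract/Pillow are pip-installed.
    from PIL import Image
    import pytesseract

    # 'photo.jpg' is a placeholder for a picture containing text.
    text = pytesseract.image_to_string(Image.open('photo.jpg'))
    print(text)  # now it's plain text you can copy and paste anywhere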
Not sure about all NPUs, but TPUs like Google's Coral accelerator are absolutely, massively more efficient per watt than a GPU, at least for things like image processing.
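For reference, dispatching a small model to the Coral looks roughly like this with the tflite_runtime API. A minimal sketch, assuming a Linux host with the libedgetpu runtime installed and a model already compiled for the Edge TPU; the model path and dummy input are placeholders:

    # Minimal sketch: run an Edge-TPU-compiled TFLite model on a Coral accelerator.
    import numpy as np
    import tflite_runtime.interpreter as tflite

    interpreter = tflite.Interpreter(
        model_path='mobilenet_v2_edgetpu.tflite',  # placeholder model file
        experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Dummy input with whatever shape/dtype the model declares.
    data = np.zeros(inp['shape'], dtype=inp['dtype'])
    interpreter.set_tensor(inp['index'], data)
    interpreter.invoke()  # the delegate runs the graph on the TPU, not the CPU
    print(interpreter.get_tensor(out['index']))

The interesting part is that the application code is identical to plain TFLite except for the one load_delegate line; the accelerator is invisible to the rest of the program.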