Comment by niothiel
2 months ago
Actually actively exploring this very topic! I have a feature-flag version where the inference runs via WASM / WebGPU (onnxruntime-web specifically).
My only hesitation about rolling this out further is that performance isn't as fast as I'd like (~1.5s latencies), and that support for WebGPU / WASM varies widely across browser and OS pairs.
Still testing it out (and learning about ViT performance on various hardware), so hopefully more news on that front soon!
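
For illustration only, here is a rough sketch of what feature-flagged backend selection along these lines might look like. It is not the actual implementation: `BackendCaps`, `pickProviders`, and `detectCaps` are hypothetical names, and the capability checks are assumptions about how support could be probed. The only things taken from onnxruntime-web are the execution provider names `"webgpu"` and `"wasm"`.

```typescript
// Hypothetical capability snapshot; field names are illustrative only.
interface BackendCaps {
  flagEnabled: boolean; // feature flag gating in-browser inference
  hasWebGPU: boolean;   // runtime exposes navigator.gpu
  hasWasm: boolean;     // runtime exposes WebAssembly
}

type Provider = "webgpu" | "wasm";

// Returns execution providers in preference order.
// An empty list means "do not run inference in the browser".
function pickProviders(caps: BackendCaps): Provider[] {
  if (!caps.flagEnabled) return [];
  const providers: Provider[] = [];
  if (caps.hasWebGPU) providers.push("webgpu");
  if (caps.hasWasm) providers.push("wasm");
  return providers;
}

// Probe the current runtime. These checks only show that the APIs exist,
// not that they will be fast or stable on a given browser / OS pair.
function detectCaps(flagEnabled: boolean): BackendCaps {
  const g = globalThis as unknown as {
    navigator?: object;
    WebAssembly?: unknown;
  };
  return {
    flagEnabled,
    hasWebGPU: g.navigator !== undefined && "gpu" in g.navigator,
    hasWasm: typeof g.WebAssembly === "object",
  };
}

// Assumed usage with onnxruntime-web (not executed here):
//   const session = await ort.InferenceSession.create(modelUrl, {
//     executionProviders: pickProviders(detectCaps(true)),
//   });
```

Keeping the selection logic as a pure function makes it easy to test against different browser / OS combinations without loading a model.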