Comment by storystarling
17 hours ago
An RTF above 1 for just 0.6B parameters suggests the bottleneck isn't the GPU, even on a 1080. The raw compute should be much faster. I'd bet it's mostly CPU overhead or an issue with the serving implementation.
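For anyone wanting to check this, here is a rough sketch of how RTF is usually computed: wall-clock processing time divided by the duration of the audio, so a value above 1 means slower than real time. The helper names `real_time_factor` and `timed` are made up for illustration and are not from any particular serving stack. Timing separate stages (preprocessing, model call, postprocessing) this way could show whether the time goes into the GPU work or the code around it.

```python
import time


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (RTF).

    RTF > 1 means the system takes longer than the audio lasts.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds).

    Assumption: fn blocks until its work is finished. GPU frameworks that
    launch kernels asynchronously need an explicit synchronize inside fn,
    otherwise the measured time only covers the launch.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Hypothetical numbers: 12 s of processing for 10 s of audio.
    print(real_time_factor(12.0, 10.0))
```

If the model call alone comes out well under the audio duration while the end-to-end number does not, that would point at overhead outside the GPU.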