Comment by numpad0
2 months ago
Even in tensor parallel modes? I thought it could only work if you're fine with stalling all but n GPUs for n users at any given moment.