Comment by tedivm
4 hours ago
Storing the cache on the GPU would be the absolute dumbest thing they could do. Locking up GPU memory for a full hour while waiting for someone else to make a request would leave essentially no GPU memory available pretty quickly. This type of caching is available from the cloud providers as well, and it isn't tied to a single session or GPU.
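
To put rough numbers on that, here's a back-of-the-envelope sketch. Every figure in it (GPU size, weight footprint, cache entry size, arrival rate) is a made-up assumption for illustration, not a measurement from any real provider, and `seconds_until_full` is just a hypothetical helper name.

```python
def seconds_until_full(gpu_mem_gb, weights_gb, cache_entry_gb, new_entries_per_sec):
    """Time until pinned cache entries use up all GPU memory left after the weights.

    Assumes no entry is evicted before its TTL expires, so the cache only grows.
    """
    free_gb = gpu_mem_gb - weights_gb
    return free_gb / (cache_entry_gb * new_entries_per_sec)


# Hypothetical numbers: 80 GB GPU, 40 GB of weights,
# 1 GB per cached prefix, one new cached prefix every 2 seconds.
ttl_seconds = 3600
fill_seconds = seconds_until_full(80.0, 40.0, 1.0, 0.5)

print(f"GPU memory full after {fill_seconds:.0f} s, cache TTL is {ttl_seconds} s")
```

With those assumed numbers the GPU fills up long before the first entry would expire, which is the whole problem with pinning the cache there.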