Comment by cyanydeez
4 hours ago
likely the small model lets whatever fuzzer they designed to poke the GPUs find optimizations much faster.
they seem to think it scales up because they're shortening the stack.