Comment by jadbox
6 days ago
Using llama.cpp with the 9b q4 xl model, Thinking mode is on by default and generation runs without stopping. The only way to force it to stop is to set the thinking budget to -1, which is weird, as the docs say 0 should be valid.