Comment by guerrilla

6 months ago

So, is the output price there why most models are extremely verbose? Is it just a ploy to make extra cash? It's super annoying that I have to constantly tell it to be more and more concise.

1 comment

guerrilla

diggan 6 months ago

> It's super annoying that I have to constantly tell it to be more and more concise.

While system promting is the easy way of limiting the output in a somewhat predictable manner, have you tried setting `max_tokens` when doing inference? For me that works very well for constraining the output, if you set it to 100 you get very short answers while if you set it to 10,000 you can very long responses.