Comment by amelius
21 hours ago
It could be that your problem was too simple to justify the use of Deep Think.
But yes, Google should have figured that out and used a less expensive mode of reasoning.
"I'm sorry but that wasn't a very interesting question you just asked. I'll spare you the credit and have a cheaper model answer that for you for free. Come back when you have something actually challenging."
Actually, why not? Recognizing problem complexity as a first step is really crucial for such expensive "experts". Humans do the same.
And a question to the knowledgeable: does a simple/stupid question cost more in terms of resources than a complex problem, in terms of power consumption?
Google actually does provide that service! https://cloud.google.com/vertex-ai/generative-ai/docs/model-...
IIRC that isn't possible under current models, at least in general, for multiple reasons: attention cannot attend to future tokens, they are existential logic, and they are really NLP and not NLU, etc.
Even proof mining and the Harrop formula have to exclude disjunction and existential quantification to stay away from intuitionist math.
IID in PAC/ML implies PEM, which is also intentionally existential quantification.
This [0] is the gentlest introduction I know of, but remember that LLMs are fundamentally set-shattering, and they also produce disjoint sets.
We are just at reactive model based systems now, much work is needed to even approach this if it ever is even possible.
[0] https://www.cmu.edu/dietrich/philosophy/docs/tech-reports/99...
Just put in the prompt customization to model responses on Marvin from Hitchhiker's Guide.
"Here I am, brain the size of a planet, and they ask me to ..."
Do they not do this in some general way behind the scenes? Surprising to me if not.
> ... Recognizing problem complexity as a first step...
Well, I don't think it's easy, or even generally possible, to recognize a problem's complexity. Imagine you ask for a solution to a simply expressed statement like: find an n > 2 where z^n = x^n + y^n. The answer you receive will be based on a model trained on this well-known problem, but if the problem isn't in the model, it could be impossible to measure its complexity.
I know this is a joke, but I have been able to lower my costs by first routing my prompts to a smaller model to determine whether I need to send them to a larger model or not.
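A minimal sketch of that kind of small-model-first routing, assuming a generic ask(model, prompt) -> str chat helper; the model names and the classification prompt are placeholders for illustration, not any real API:

    from typing import Callable

    # Hypothetical sketch: the model names are placeholders and `ask` stands in
    # for whatever chat-completion client is actually in use.
    CHEAP, EXPENSIVE = "small-model", "large-model"

    def route(prompt: str, ask: Callable[[str, str], str]) -> str:
        # First pass: the cheap model only classifies difficulty, it does not answer.
        verdict = ask(
            CHEAP,
            "Reply with exactly EASY or HARD: could a small general-purpose "
            f"model answer the following reliably?\n\n{prompt}",
        )
        # Route the real prompt based on that one-word verdict.
        model = CHEAP if verdict.strip().upper().startswith("EASY") else EXPENSIVE
        return ask(model, prompt)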
“This meeting could’ve been an email”
Model routing is deceptively hard though. It has halting problem characteristics: often only the smartest model is smart enough to accurately determine a task's difficulty. And if you need the smartest model to reliably classify the prompt, it's cheaper to just let it handle the prompt directly.
This is why model pickers persist despite no one liking them.
Yes, but prompt evaluation is far faster than inference since it can be done (mostly) in parallel, so I don't think that's true.
The problem is that input token cost dominates output token cost for the majority of tasks.
Once you've given the model your prompt and are reading the first output token for classification, you've already paid most of the cost of just prompting it directly.
That said, there could definitely be exceptions for short prompts where output costs dominate input costs. But these aren't usually the interesting use cases.
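A quick back-of-the-envelope sketch of that point; all prices and token counts here are made up purely for illustration, not any provider's real pricing:

    IN_PRICE, OUT_PRICE = 3e-6, 15e-6       # $ per input / output token, big model

    def cost(in_toks: int, out_toks: int) -> float:
        return in_toks * IN_PRICE + out_toks * OUT_PRICE

    prompt_toks, answer_toks = 20_000, 500   # long prompt, short answer

    direct = cost(prompt_toks, answer_toks)  # just answer with the big model
    classify_only = cost(prompt_toks, 5)     # big model emits a one-word verdict
    print(f"direct: ${direct:.4f}, classify step alone: ${classify_only:.4f}")
    # direct: $0.0675, classify step alone: $0.0601 -> ~89% of the cost spent before routing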
But if the weaker model has a low false-positive rate, you can just route prompts through the models in order of strength.
That's a very big "if".