Comment by fnord123
10 days ago
They aren't arguing against LLMs. They are arguing against their toaster's LLM, which only needs to make the perfect toast, being trained on the tax policies of the Chang Dynasty.
Thing is, we keep finding out again and again that a very broad training mix in the base model makes it better across the board, including on those specialized tasks once you fine-tune it.
As I understand it, the general ability to reason is what the models get out of "being trained on the tax policies of the Chang Dynasty", and we haven't really figured out a better way to instill it than throwing just about everything at them. And even if all you do is make toast, you still need some intelligence.
I'm aware! And I'm personally excited about small models, but my intuition is that pouring more and more money into giant general-purpose models will keep paying off as long as it keeps producing better general-purpose results (which maybe it won't).