
Comment by colechristensen

4 days ago

Sure: interacting in natural language without any expectation that the model itself contains knowledge. That's good for things like tool use and embedding-based retrieval, where all the information is retrieved rather than stored in the weights.
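
A minimal sketch of that retrieval pattern, assuming sentence-transformers is installed and using a toy in-memory corpus (the model name and corpus are placeholders, not any particular setup). The point is that the model only has to read the retrieved passage, not recall the fact:

```python
# Minimal retrieval sketch: the knowledge lives in the corpus,
# not in the model's weights. Assumes `pip install sentence-transformers numpy`.
import numpy as np
from sentence_transformers import SentenceTransformer

corpus = [
    "The capital of Australia is Canberra.",
    "Water boils at 100 degrees Celsius at sea level.",
    "The Great Wall of China is over 13,000 miles long.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model
doc_vecs = embedder.encode(corpus, normalize_embeddings=True)

query = "What is the capital of Australia?"
q_vec = embedder.encode([query], normalize_embeddings=True)[0]

# On normalized vectors, cosine similarity reduces to a dot product.
best = corpus[int(np.argmax(doc_vecs @ q_vec))]

# The LLM only needs to read and restate `best`, not know the fact itself.
prompt = f"Answer using only this context:\n{best}\n\nQuestion: {query}"
print(prompt)
```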

Are these small models trained to privilege "raw intelligence" over factual knowledge? Is there any indication of how much of a current model is dedicated to knowledge of multiple languages and tons of facts, rather than pure understanding and reasoning?

  • The evaluations provide this indication. You'll see MMLU, GPQA, BIG-bench, etc. in reports for many models; those numbers are the indication you're looking for (a sketch of running them yourself follows this comment).

    To answer a question you didn't ask: with small models especially, we have to choose what to focus on. For this model we focused on text summarization and instruction following, with the idea that users would fine-tune to gain performance on the task set that is relevant to them (rough fine-tuning sketch below).
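
For the benchmark numbers mentioned above, a hedged example using EleutherAI's lm-evaluation-harness; the checkpoint name is a placeholder and exact task names vary between harness versions, so verify them first:

```python
# Sketch of checking knowledge-heavy benchmarks with lm-evaluation-harness
# (pip install lm-eval). The model name is a placeholder; task names differ
# across harness versions, so confirm with `lm-eval --tasks list`.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/your-small-model",  # hypothetical checkpoint
    tasks=["mmlu", "gpqa_main_zeroshot"],  # knowledge-heavy evals
    num_fewshot=0,
)

# Low MMLU/GPQA scores alongside decent instruction-following scores
# suggest capacity was spent on reasoning rather than stored facts.
for task, metrics in results["results"].items():
    print(task, metrics)
```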
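
And a rough sketch of the fine-tuning path the reply describes, using Hugging Face transformers with a LoRA adapter from peft. The base checkpoint, dataset file, and target modules here are all assumptions for illustration, not the commenter's actual recipe:

```python
# Rough LoRA fine-tuning sketch (pip install transformers peft datasets).
# The checkpoint and dataset below are placeholders for your own task data.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "your-org/your-small-model"  # hypothetical small base model
tok = AutoTokenizer.from_pretrained(base)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # many causal LMs ship without a pad token

model = AutoModelForCausalLM.from_pretrained(base)
# Wrap with low-rank adapters so only a small fraction of weights train.
# target_modules varies by architecture; q_proj/v_proj fit llama-style models.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

data = load_dataset("json", data_files="my_task.jsonl")["train"]  # your task set
data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
    # mlm=False gives next-token-prediction labels (padding masked to -100)
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```

The adapter approach keeps the frozen base model intact, which fits the "specialize it per task set" idea: each user trains a small adapter on their own data instead of a full copy of the weights.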