Comment by canyon289
10 days ago
This is a key question we faced when building this model. It basically comes down to "how good" do you need to be at "how many things". We had to make some choices with this model and did our best to maximize performance in those areas.
To answer this more precisely, it's a matter of choosing different data and training regimes and checking performance with evals.
And to make this fully concrete, you're welcome to give it a try! Train this model on a taskset of your choice and measure the performance tradeoffs. You'll get a good sense of how LLM capabilities shift.
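As a rough illustration of that loop (the model name, datasets, and scoring rule below are placeholders, not what we actually used), a minimal fine-tune-and-eval sketch with Hugging Face transformers might look like:

```python
# Minimal sketch of "train on a taskset, measure the tradeoffs".
# MODEL_NAME, the datasets, and the substring-match scoring are all
# hypothetical placeholders -- swap in your own model and evals.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "your-small-causal-lm"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def finetune(model, examples, epochs=1, lr=1e-5):
    """Plain supervised fine-tuning on (prompt, target) text pairs."""
    optimizer = AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for prompt, target in examples:
            batch = tokenizer(prompt + target, return_tensors="pt")
            # Causal LM loss: labels are the input ids, shifted internally.
            loss = model(**batch, labels=batch["input_ids"]).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

def evaluate(model, eval_set):
    """Fraction of prompts whose greedy completion contains the expected answer."""
    model.eval()
    hits = 0
    for prompt, answer in eval_set:
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=32)
        completion = tokenizer.decode(
            out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        hits += int(answer in completion)
    return hits / len(eval_set)

# task_train / task_eval: your chosen taskset; general_eval: a broad-capability set.
# Comparing scores before and after fine-tuning shows what improves and what regresses:
#   before = {"task": evaluate(model, task_eval), "general": evaluate(model, general_eval)}
#   finetune(model, task_train)
#   after  = {"task": evaluate(model, task_eval), "general": evaluate(model, general_eval)}
```

The point isn't this particular scoring rule or optimizer; it's that running the same evals before and after training makes the capability tradeoffs visible.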