Comment by oytis
2 days ago
The press release talks a lot about how it was done, but very little about how capabilities compare to other open models.
It's a university; teaching the "how it's done" is kind of the point.
Sure, but usually you teach something that is inherently useful, or that can be applied to some useful endeavor. In this case I think it's fair to ask what the collision of two bubbles really achieves — or, if it's just a useful teaching model, what it can be applied to.
The model will be released in two sizes — 8 billion and 70 billion parameters [...]. The 70B version will rank among the most powerful fully open models worldwide. [...] In late summer, the LLM will be released under the Apache 2.0 License.
We'll find out in September if it's true?
Yeah, I was thinking more of a table with benchmark results
I hope DeepSeek R2, but I fear Llama 4.