Comment by rafaquintanilha
1 day ago
I have no affiliation with them but here's what I think happened:
1. They claim the official model is based on Qwen 397B. It's likely they didn't disclose Nex Pro at all because Nex itself is based on the same base model (not saying they shouldn't).
2. The improvement would come from merging the weights PLUS on-policy distillation. The confusion is that the uploaded model didn't have the distillation at all.
3. It's important to note that they didn't advertise the model beyond posting it on Reddit 2 days ago. It went viral organically, over the weekend, and during Brazil's World Cup debut (Brazilians will understand). Of course the mayor of Rio took the opportunity to capitalize on the free coverage, but that wasn't done in conjunction with the researchers.
4. I don't see why they would disclose Qwen 397B as base and mention the SwiReasoning paper but not mention Nex if all they did was to merge both models.
5. In any case, what they are claiming is easily verifiable once (if) they upload the right model.
Regarding #2
https://news.ycombinator.com/item?id=48529544
This should be at the top: they uploaded the wrong model, they fixed it
They did upload the wrong model, but as of the time of writing they have not fixed it. Right now, 12 hours after they took the old one down, there is simply no model present in their Hugging Face repo.
I'm honestly impressed that this even happened at all. "Rio de Janeiro's homegrown LLM" is probably the last headline I ever expected to read on HN.
Worth reminding everyone that Lua was also created in Rio, though admittedly at PUC rather than by the government.
Rio has a strong engineering talent pool, as do many other major capitals in Brazil.
Brazil does have talent. Mauro Carvalho Chehab is a Linux kernel maintainer. Elixir was created by José Valim, a Brazilian. I have also created my own programming language.
What Brazil doesn't have is a history of properly rewarding talent, which often causes it to migrate elsewhere. So it's definitely surprising when any sort of technological development happens in Brazil: it implies someone who stayed managed to get something done, most likely for much less than what that something is actually worth, while also being crushed by extremely high taxes that essentially double the cost of computer hardware.
Yes. Though even more than the US, their engineering talent from top schools heads into consulting and finance.
Yes! That "prefeitura do Rio" huggingface URL is definitely shocking to read to this Brazilian as well (I'm assuming you and parent also are from your usernames).
It seems to me this is clearly a mistake. As far as I know they would not even have the resources for it, and I don't think they are in a position to make such bold claims.
> 2. The improvement would come from merging the weights PLUS on-policy distillation. The confusion is that the uploaded model didn't have the distillation at all.
They merged the base model with another lab's fine-tuned model. The improvements could have come from picking up some of the fine-tuned weights from the other model.
If they really had a better-performing model that they "accidentally" forgot to upload, they could have uploaded the correct file by now.
Seems they did
https://news.ycombinator.com/item?id=48529544
I only see an edit to the readme (13h ago) and removal of the weights, so the repo is now empty.
I am willing to give them the benefit of the doubt, but we've seen this before: a model gets released that is supposedly state-of-the-art, yet seems to be another repackaged model without any training. Reflection 70B was the most similar example; all they need now is an API that rewrites "Claude" to "Rio".
What do you mean World Cup debut? Haven't they won 5?
They meant their first, opening game of this current World Cup tournament
My understanding is that they didn't do any distillation. Every weight is a 60/40 element-wise average of Qwen and Nex. Is this possible if the Rio contractor did their own post-training as claimed?
https://x.com/tenobrus/status/2066243352211996728/photo/1
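For what it's worth, the "pure element-wise average" claim is the kind of thing that can be checked numerically. Below is a minimal sketch, not anyone's actual analysis: it uses toy random vectors in place of real checkpoints, assumes the 60/40 split means 0.6 Qwen + 0.4 Nex, and `fit_blend_ratio` is a hypothetical helper name. It fits the single blend ratio that best explains the merged weights and reports the worst leftover error. A plain merge leaves essentially zero residual, while any further training on top of the merge should leave a visible one.

```python
import random


def fit_blend_ratio(merged, a, b):
    """Least-squares alpha with merged ~= alpha*a + (1-alpha)*b, plus the max residual."""
    # Rearranged: (merged - b) = alpha * (a - b), a one-parameter linear fit.
    num = sum((m - y) * (x - y) for m, x, y in zip(merged, a, b))
    den = sum((x - y) ** 2 for x, y in zip(a, b))
    if den == 0:
        raise ValueError("a and b are identical; the blend ratio is undefined")
    alpha = num / den
    residual = max(
        abs(m - (alpha * x + (1 - alpha) * y)) for m, x, y in zip(merged, a, b)
    )
    return alpha, residual


# Toy stand-ins for flattened weight tensors (assumption: not real model weights).
rng = random.Random(0)
qwen = [rng.gauss(0, 1) for _ in range(1000)]
nex = [rng.gauss(0, 1) for _ in range(1000)]

# Case 1: a pure 60/40 element-wise average.
pure_merge = [0.6 * x + 0.4 * y for x, y in zip(qwen, nex)]
# Case 2: the same merge followed by simulated post-training (small random updates).
post_trained = [w + rng.gauss(0, 0.01) for w in pure_merge]

alpha_pure, res_pure = fit_blend_ratio(pure_merge, qwen, nex)
alpha_pt, res_pt = fit_blend_ratio(post_trained, qwen, nex)

print(f"pure merge:   alpha={alpha_pure:.4f}, max residual={res_pure:.2e}")
print(f"post-trained: alpha={alpha_pt:.4f}, max residual={res_pt:.2e}")
```

On real checkpoints you would run this per tensor rather than over one flat list. If every tensor gives the same ratio with a residual at floating-point noise level, that is hard to square with any post-training having been applied to the uploaded file.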